ABSTRACT


The  relationship  between  the  page  size,  program  behavior,  and 
page  fetch  frequency  in  storage  hierarchy  systems  is  formalized  and 
analyzed.  It  is  proven  that  there  exist  cyclic  program  reference 
patterns that can cause page fetch frequency to increase
significantly if the page size used is decreased (e.g., reduced by half).
Furthermore,  it  is  proven  in  Theorem  3  that  the  limit  to  this 
increase  is  a  linear  function  of  primary  store  size.  Thus,  for 
example,  on  a  typical  current-day  paging  system  with  a  large 
primary  store,  the  number  of  page  fetches  encountered  during  the 
execution of a program could increase 200-fold if the page size
were  reduced  by  half. 

The concept of temporal locality versus spatial locality is
postulated to explain the relationship between page size and
program behavior in actual systems. This concept is used to develop
a technique called the "tuple-coupling" approach.

Consistent with the results above and by generalizing
conventional two-level storage systems, a design for a general
multiple level storage hierarchy system is presented. Particular
algorithms and implementation techniques to be used are discussed.
CHAPTER 1.

INTRODUCTION AND PLAN OF THESIS

1.0 


The primary goal of this thesis is to provide insight
into and shed additional light on several key problems in
the design and analysis of general storage hierarchy
systems.

1.1


The importance of research in storage hierarchy systems
has been pointed out by Prof. F. J. Corbató recently in the
MIT Project MAC Progress Report (July 1971):


"By now, it has become accepted lore in the computer
system field that use of automatic management
algorithms for memory systems, constructed of
several levels with different access times, can
[...] a significant simplification of programming
effort. ... Unfortunately, behind the mask of
acceptance hides a worrisome lack of knowledge
behind how to engineer a multilevel memory system
with appropriate algorithms which are matched to the
load and hardware characteristics."


On multiple level storage hierarchies, Prof. J. H.
Saltzer was even more explicit (subject notes on
"Information Systems", MIT, 1972, p. 4-5d):


"An interesting problem arises if one has three or
more technologies to deal with. ... The problem of
predicting the performance of a three level,
automatically managed system is not at all well
understood. ... Although the need for more than one
level has already been argued, there is currently no
known criterion for introducing three, four, or n
levels for a given system. ... Although there are by
now many implementations of two level memory
systems, the dynamic management of a three or more
level memory system is such an uncharted area that
there do not yet exist examples of practical
algorithms which one can examine."


1.2 Specific Goals and Accomplishments

The specific goals and accomplishments of this thesis,
which are further elaborated later, are:

•  Analyze the effect of certain parameters, such as
   page size, upon the performance of a storage
   system.

•  Develop a concept of locality based upon both
   spatial and temporal adjacency in address
   reference patterns that explains certain anomalies
   discovered in actual paging systems.

•  Propose, formalize, and measure the performance of
   new "spatial-removal" storage management
   algorithms, in particular "tuple-coupling".

•  Design a practical algorithm for effective
   management of multiple level storage hierarchy
   systems and demonstrate its effectiveness under
   some simulated system loads.


1.3 General Structure of Thesis

The key plan of this thesis is to investigate several
crucial problems and requirements of multiple level storage
hierarchy systems. Particular areas are identified and
corresponding theories developed and proven. A new and
general design for storage hierarchy systems is also
presented and evaluated. Finally, empirical measurements are
presented to validate and calibrate the overall design and
specific theoretical conjectures.

This thesis is organizationally divided into 8
chapters. The structure can be best introduced by outlining
the content of the following chapters in the sections below.

1.3.1 Chapter 2: Motivation for Storage Hierarchy Systems

This chapter presents a perspective on the storage
hierarchy problem and the motivation for such systems. It
is primarily written for the benefit of people knowledgeable
in the general computer field but who are not especially
experienced in storage hierarchy systems. For the expert
reader, this chapter exposes the biases and orientations of
the author and thus sets the tone for the remainder of the
thesis. This chapter also briefly reviews the history of
research in storage systems and cites numerous references.

1.3.2  Chapter  3:  Formalization  of  Storage  Hierarchy  Systems 

A description and formalization of the basic
characteristics of storage hierarchy systems is presented in
this chapter. This is followed by a summary and critical
analysis of research that directly relates to the specific
goals of this thesis.

1.3.3  Chapter  4:  A  Storage  Hierarchy  System 

In this chapter the key concepts of the proposed
storage hierarchy system are presented and discussed. The
principal and novel techniques are briefly described below:

1.3.3.1 Continuous Hierarchy

The ratio of performance between adjacent levels is
kept moderate (e.g., a factor of 100 or less) to minimize
discontinuities or awkward special-case algorithms. This is
in contrast to many current systems with inter-level ratios
of 1000 or more.
1.3.3.2 Shadow Storage and Page Splitting

Information is transferred in decreasingly smaller size
blocks as it is passed up from low performance levels of the
hierarchy toward the "request generator" at the uppermost
level. Thus, the information that is finally received by the
request generator has left a "shadow" behind in the lower
levels. The significance and rationale for this technique is
further elaborated in Chapter 6.
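
The page splitting idea can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the design developed in this thesis: the function name, the four block sizes, and the requirement that each level's block size evenly divide the block size of the level below it are all hypothetical choices made for the example.

```python
def shadows_left_behind(address, block_sizes):
    """Return the address range copied into each level for one request.

    block_sizes lists the (hypothetical) transfer block size of each
    level, lowest performance level first, in decreasing order.  Each
    level keeps the aligned block that contains the requested address,
    so every higher level holds a smaller piece of the block below it.
    """
    ranges = []
    for size in block_sizes:
        start = (address // size) * size   # align down to a block boundary
        ranges.append((start, start + size))
    return ranges


# Hypothetical block sizes for four levels, slowest level first.
block_sizes = [4096, 1024, 256, 64]
ranges = shadows_left_behind(5000, block_sizes)
for size, (start, end) in zip(block_sizes, ranges):
    print(f"block size {size:5d}: addresses {start}..{end - 1}")
```

Each range is contained in the one before it, which is the sense in which the small block delivered to the request generator leaves progressively larger "shadows" in the lower levels.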

1.3.3.3 Automatic Management

In order to reduce the load on the central processor
and provide for more efficient and parallel operations, the
storage management function will be distributed and
incorporated into the storage levels (e.g., "intelligent"
device controllers [1], etc.). This technique also reduces
the complexity of the operating system software.

1.3.3.4 Direct Transfer

Storage transfers between two adjacent levels need not
have any effect upon nor require the assistance of any other
levels (e.g., there is no need to move information from
level n to level 1 and then from level 1 to level n-1 if
only level n to level n-1 was needed; this two step process
is often required on contemporary systems). Direct transfer
is accomplished by synchronizing non-mechanical storage
devices or by using "rubber-band" buffers [33] between
electro-mechanical storage devices.

1.3.3.5 Read Through

Storage transfers, as noted above, are only made
between adjacent levels of the hierarchy, such as from level
n to level n-1. But, each level, such as level n-1, can
connect its input bus (from lower level n) to its output bus
(to higher level n-2) so that the data can be read through
(i.e., transferred to level n-2 while being stored in level
n-1). A similar, though specialized, technique is already
used in certain systems, such as the IBM System/370 Models
155 and 165 cache systems [52].

This results in performance similar to a direct
connection from each level to the request generator but it
provides much more control in the storage levels and a much
simpler structure.
1.3.3.6 Store Behind


By using the excess capacity of the inter-level
channels, there is a continual flow of altered data from the
higher levels to the lowest level permanent storage. Thus,
the actual updated information is stored behind (after) the
store initiation from the request generator. The updated
information is propagated down, level by level. Whenever
information is altered at a particular level, it is tagged
as altered and is scheduled for a "store behind" operation.
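
A minimal Python sketch of the tagging and level-by-level propagation described above follows. It is an assumption-laden illustration rather than the mechanism specified later in the thesis: the `Level` class, the `store` and `store_behind_step` names, and the idea that one "step" stands for one use of spare channel capacity between each pair of adjacent levels are all hypothetical.

```python
class Level:
    """One storage level: its contents plus the set of altered addresses."""

    def __init__(self, name):
        self.name = name
        self.data = {}         # address -> value held at this level
        self.altered = set()   # addresses tagged for a store behind operation


def store(levels, address, value):
    """A store from the request generator touches only the uppermost level."""
    top = levels[0]
    top.data[address] = value
    top.altered.add(address)


def store_behind_step(levels):
    """Push every altered item down exactly one level.

    Pairs are visited from the bottom up so that an item copied into a
    level during this step is not pushed further until the next step.
    """
    for upper, lower in reversed(list(zip(levels, levels[1:]))):
        for address in sorted(upper.altered):
            lower.data[address] = upper.data[address]
            if lower is not levels[-1]:
                lower.altered.add(address)   # still has further to travel
        upper.altered.clear()


levels = [Level("cache"), Level("main"), Level("permanent")]
store(levels, 10, "x")
store_behind_step(levels)   # the update now also resides in "main"
store_behind_step(levels)   # the update has reached "permanent"
```

The request generator sees the store complete as soon as the top level is updated; the lowest level catches up two steps later in this three-level example.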

1.3.4  Chapter  5:  Analysis  of  Page  Size  Considerations 

One of the most important parameters of a storage
hierarchy system is the page size chosen as the unit of
transfer between two levels of the hierarchy. In this
chapter, the factors influencing page size are examined from
the device characteristics viewpoint and the program
behavior viewpoint.


Of particular concern, it has been noticed by Hatfield
[47] and Seligman [78] and formalized in Chapter 5 that:


"There exists a page trace, P, and demand-fetch
FIFO-removal or LRU-removal inter-level storage
systems, S and S', with page sizes N and N'=N/2,
respectively, such that the ratio, r, of fetch
frequency f' to f exceeds 2."
This result runs counter to the hoped-for behavior of
decreased page sizes as noted by Denning [25]:


"... small pages permit a great deal of compression
without loss of efficiency. Small page sizes will
yield significant improvements in storage
utilization ..."


In this chapter the significance of this problem is
demonstrated by proving that even "well-behaved" removal
algorithms, such as stack algorithms [63], are not immune to
this adverse performance behavior. Furthermore, the nature
of this phenomenon is analyzed and bounds on its behavior
are developed.
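
The effect can be reproduced with a small simulation. The Python sketch below is an illustration only, not the construction used in Chapter 5: the function name, the four-word primary store, and the five-address cyclic trace are hypothetical choices. It counts demand fetches under LRU removal for the same address trace and the same primary store size, once with pages of two words and once with pages of one word.

```python
from collections import OrderedDict


def lru_page_fetches(trace, page_size, store_size):
    """Count demand fetches for an address trace under LRU removal.

    store_size and page_size are in words, so the primary store holds
    store_size // page_size page frames.
    """
    frames = store_size // page_size
    resident = OrderedDict()            # pages ordered oldest -> most recent
    fetches = 0
    for address in trace:
        page = address // page_size
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            fetches += 1
            if len(resident) >= frames:
                resident.popitem(last=False)   # remove least recently used
            resident[page] = True
    return fetches


# A cyclic reference pattern over five word addresses, repeated 100 times.
trace = [4, 2, 0, 1, 3] * 100

fetches_large = lru_page_fetches(trace, page_size=2, store_size=4)
fetches_small = lru_page_fetches(trace, page_size=1, store_size=4)
print(fetches_large, fetches_small)     # 201 500
```

With two-word pages, each reference to one half of a page also keeps its other half resident, so only two fetches occur per cycle after the first. With one-word pages, the five distinct addresses cycle through four frames and every reference causes a fetch, giving a ratio of about 2.5 for this trace.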

1.3.5 Chapter 6: Spatial vs. Temporal Locality Model of
Program Behavior

A primary rationale for hierarchical storage systems is
based upon the "Principle of Locality". Unfortunately, this
principle is still a poorly understood, or at least
controversial, phenomenon. It is difficult to determine the
original "discoverer" of this principle but it is
interesting to note that its definition has changed in time.
For example, Denning [29, p.J], in 1968 loosely described
locality as:


Storage Hierarchy Systems




"the  idea  that  a  computation  will,  during  an 
interval  of  time,  favor  a  subset  of  the  information 
available  to  it." 

Later, in 1970, Denning [26, p.180] defined it more
precisely based upon the concepts of "working set" and
"reference density", which for a page i at time k is:

a(i,k) = Pr[reference r(k) = i],

such that R(k) is the ranking of all n pages based upon
a(i,k); thus:

"PRINCIPLE OF LOCALITY: The rankings R(k) are
strict and the expected ranking lifetimes long."

This is a much more restrictive definition of locality than
his earlier general concept.

In fact, many current storage management systems were
devised first, a general model was then constructed to
describe the system, and finally a "formal" definition of
locality was developed to be consistent with the storage
management model. This is a reasonable strategy as long as
the underlying concepts of "the principle of locality" are
not lost in the process. Unfortunately, this appears to
have happened on several occasions. In particular, most
popular definitions of locality tend to be useless for
analyzing or explaining either the relationship of page size
upon program behavior or the impact of generalizing
two-level storage systems to multiple level hierarchical
storage systems.

In this chapter a new view of locality is presented (or
an old view resurrected, since it most closely resembles some
of the very early descriptions of locality). In particular,
it is shown that the general concept of locality can be
subdivided into two separate factors, temporal locality and
spatial locality. These concepts are defined and justified
and then used to explain some peculiar phenomena
("anomalies") observed in actual two-level storage systems.

By means of address traces and storage system
simplifications, the temporal and spatial locality behavior
of actual programs is empirically measured. These results
are used to reinforce and calibrate the storage hierarchy
system design presented in Chapter 4.

1.3.6 Chapter 7: Spatial Removal Storage Management
Algorithms

Various hierarchy storage management algorithms, such
as fetch (e.g., demand-fetch) and temporal removal (e.g.,
first-in first-out (FIFO), least recently used (LRU), etc.)
have been developed, primarily for two-level hierarchies.
There appear to be no spatial removal algorithms described
in the literature. Based upon Chapter 6, several spatial
removal algorithms are proposed and analyzed.

It is also shown that some of the problems described in
Chapter 5 can be solved by spatial removal algorithms. In
particular, Hatfield noted that:

"as yet we have been unable to prove that there is a
replacement algorithm using only the past history of
page requests which cannot generate more than twice
the exceptions with half size pages."

In this chapter a new algorithm, named tuple-coupling, is
presented. It is formally proven that it satisfies
Hatfield's requirements above.

Furthermore, the operational behavior of tuple-coupling
is analyzed by measuring the performance of actual programs.

1.3.7 Chapter 8: Discussion and Conclusions


In addition to a general summary of the significant
aspects of the thesis, this chapter also outlines important
areas for future research.
CHAPTER 2.

THE STORAGE HIERARCHY PROBLEM

2.0  Introduction 

The evolution of computer systems has been marked by a
continually increasing demand for faster, larger, and more
economical storage facilities. In addition to the obvious
concern for better performance, the organization of a
computer system's storage plays a key role in program
development and programmer efficiency. It has often been
claimed that "any software design blunder can be overcome by
adding more memory".

It has become generally recognized that the conflicting
requirements of high-performance yet low-cost storage may be
best satisfied by a mixture of technologies combining
expensive high-performance devices with inexpensive
lower-performance devices. This strategy has been given
several names, such as "hierarchical storage system",
"automatic multilevel storage management", "virtual memory",
and the inevitable "virtual memory system for the automatic
multilevel management of a hierarchy of storage devices".
In this thesis the somewhat shorter term storage hierarchy
system will be used.

Investigations into automated storage hierarchy
techniques can be traced back more than a decade. If we
were to include manual techniques, we would find storage
hierarchies at the very dawn of the "computer age".
Unfortunately, there are still many unsolved and poorly
understood problems. This situation can be partly explained
by the fact that these systems tend to be (1) extremely
complex, (2) ill-suited to most conventional analytical
techniques, and (3) deeply influenced by the rapidly evolving
computer technology which keeps "changing the ground rules"
at often frightening rates. In spite of these challenging
stumbling blocks, a successful storage hierarchy system is
so important to the future usefulness of computer systems
that we cannot afford to abandon the search.

2.1 Storage Hierarchy Objectives

Before delving into details, it is worthwhile to
briefly consider the needs and uses for an effective storage
hierarchy.
2.1.1 System Performance and Economics

As logic technology and computer architecture
techniques have advanced, we have found it possible to
produce systems of incredible speed. Such systems are often
rated, rather crudely, in terms of MIPS (millions of
instructions per second). Experimental systems of over 100
MIPS have been developed (e.g., ILLIAC IV and CDC STAR).
Even "conventional" large-scale systems have passed the 5 or
10 MIPS mark (e.g., CDC 7600 and IBM 370/195). It has long
been observed that the input/output (I/O) requirements,
especially for "secondary storage", of a conventional system
tend to be strongly related to the processor's speed. In
fact, based upon several empirical measurements, it has been
postulated that a computer system averages 1 bit of I/O for
every instruction executed (this is often referred to as
Amdahl's Constant [ref]). As a result, many of these
high-performance systems have been confronted with massive
bottleneck problems in the I/O area, especially since these
I/O demands tend to occur in bursts. An effective storage
hierarchy system could go a long way toward reducing this
problem.

At the other end of the spectrum we find that medium-
and low-cost processors, the latter are usually called
mini-computers, have made substantial advances in recent
years. The term can be quite misleading: these
processors are typically hundreds of times faster than the
early commercial computers at a fraction of the cost
(the UNIVAC I, circa 1951, could perform about 2000 12-digit
additions per second whereas contemporary mini-computers
operate at around 1,000,000 5-digit additions per second).
Although these mini-processors may be midgets compared to
the computational problems attacked by their "big brothers"
described above, they are more than adequate for the vast
majority of information processing problems which have modest
computational requirements. Due to technological advances
and economies of scale resulting from large-scale
production, some minicomputers are available for less than
$2000 with slightly slower micro-computers being offered for
as little as $66 [19]. In spite of these technological
advances, these processors have not had much impact on most
information system needs due to the continuing economic
problem of producing large capacity inexpensive storage
devices even at the modest performance required. A $66
processor is largely irrelevant if the storage costs are in
the $100,000 or more range. By developing an effective
storage hierarchy system, we can go a long way toward
bringing the storage costs down to the level of these
inexpensive processors. As a result, a tremendous number of
currently known technical solutions to information
processing problems will finally become economical.


2.1.2


As noted earlier, the organization of a computer's
storage system has a considerable impact upon program
development and programmer efficiency. The potential
increase can be obtained by reducing or eliminating
constraints normally imposed by the storage system. These
constraints often distract the programmer to the extent
that he devotes a substantial amount of his time to
overcoming the system's limitations rather than solving the
intrinsic problems. Shooman [80] noted that:


"The inherent error content of some programs is
claimed to be related to the excess memory capacity
available. The theory here is that if the memory is
very cramped, the software writers will have to
resort to overlays and other coding "tricks" to
squeeze the desired functions into the allocated
memory space. It is assumed that these tricks
introduce great complexity and are the seat of many
errors. This effect is cited by designers of
airborne computers where the allocation of another
block of [...] of memory is a major package decision."


For example, the programmer often has to worry about:
2.1.2.1 Programming language code efficiency.

If a higher-level language compiler tends to produce
programs that are at all larger than those produced by a
low-level language translator, it may be necessary to use
the low-level language to conserve storage. This constraint
is contrary to the generally accepted fact that high-level
languages enhance programming productivity.

2.1.2.2 Program size.


For any specific storage size, there are programs that
cannot be easily written to fit into that size constraint.
Yet, programmers frequently try - with considerable effort.

2.1.2.3 Data structures.

The programmer is often faced with the need to choose
between a data structure representation that is convenient
to use and another representation that "saves storage".
This saving may require the use of an awkward or
unnecessarily complex data structure representation.
2.1.2.4 Specific equipment characteristics.

If the programmer must get the "most" out of his
storage system in terms of capacity and performance, he may
resort to techniques that are peculiar to his specific
storage system equipment. If the equipment is changed,
there may be a considerable impact upon his software.

We would like to develop storage hierarchy techniques
that eliminate, automate, or at least minimize the
programming problems described above.

2.1.3  Integrate  New  Technologies  and  Applications 

Although  there  has  been  continual  evolution,  the  basic 
storage  device  technologies  in  commercial  use  have  not 
changed dramatically in the past decade. As a result, there
has been a tendency, motivated by actual need, to relate
applications to the specific available technologies. This
has caused certain application areas to be abandoned as
"infeasible" and many storage management strategies to be
discredited as "irrelevant" or "inefficient". In the passage
of time we remember the applications and techniques in use
but frequently forget or ignore the alternatives possible
and the reasons for bypassing these alternatives.
After this rather long "rest", it appears that we are
on the verge of some major "awakenings" in applications and
technology. It is hard to quantify the new application
needs other than requiring more and faster storage for less
money. Section 1.1.2 presents some of these motivations. The
revitalized interests in time-sharing, artificial
intelligence and automatic programming are also "fanning the
fire".


Due to the uncertainty of advanced research in storage
device technologies, it is difficult to foresee accurately
which of the many active efforts will succeed (see for
example, Ayling [7], Best [15], Bobeck [16], Camras [17],
Dali [24], Fields [35], Gardner [39], Howard [50], Matick
[SMatick.], Myers [69], Rector [74], Shahbender [79],
Thompson [85]). Considering the technical advances clearly
demonstrated in the laboratory and the driving "profit"
motivation, it is reasonable to expect some dramatic changes
in the next few years. Even if we don't know what or when,
we would be foolhardy to totally ignore this situation.

Table 1 below indicates the performance and price
characteristics of typical current-day storage technologies.
The two entries marked by question marks (?), Bulk Store and
Giant Store, indicate new technologies that have already
                        Random        Maximum
                        Access        Transfer
                        Time          Rate          Price        Capacity
    Device              (seconds)     (bytes/sec)   ($/byte)     (bytes)

1.  Cache Store         1.6x10^-7     1x10^8        8.8x10^0     1.6x10^4
    (IBM 3165)          (160 ns)      (100M b/s)    ($8.80)      (16K)

2.  Main Store          1.44x10^-6    1.6x10^7      5x10^-1      5.12x10^5
    (IBM 3360)          (1.44 us)     (16M b/s)     (50¢)        (512K)

3.  Bulk Store?         1.3x10^-4     8x10^6        8.8x10^-2    2x10^6
    (AMS SSU [35])      (130 us)      (8M b/s)      (8.8¢)       (2M)

4.  Large Store         5x10^-3       1.5x10^6      2.2x10^-2    1.1x10^7
    (IBM 2305-2)        (5 ms)        (1.5M b/s)    (2.2¢)       (11M)

5.  Mass Store          3.8x10^-2     8x10^5        4.5x10^-4    2x10^8
    (IBM 3330)          (38 ms)       (800K b/s)    (.045¢)      (200M)

6.  Giant Store?        6x10^0        6x10^5        2.2x10^-5    1.6x10^10
    (Grumman            (6 sec)       (600K b/s)    (.0022¢)     (16B)
    MASSTAPE)


Table 1.

Representative Storage Hierarchy
been placed in limited use. Since these two
cost/performance positions were not part of our
"traditional" technologies, we are faced with the problem of
possibly modifying our applications and developing new
strategies to efficiently, effectively, and, hopefully
optimally, integrate them into our overall hierarchical
storage system.

As the entire spectrum of computer architectures, as
well as storage device technologies, undergoes reshufflings,
both evolutions as well as revolutions, it is worthwhile to
review and reconsider our current concepts on storage system
design. Table 1, although a simplified summary of current
storage technologies, illustrates the fact that there exists
a spectrum of devices that span about 6 orders of magnitude
of price/performance (100,000,000%). This is quite
significant in the light of the excitement that normally
accompanies an improvement of 10-20% in performance or a
decrease of 10-20% in price in current-day systems. The
participants in this "storage sweepstakes" may change in
time, but with such large price/performance stakes, there
will be continuing benefits to "playing the game" better.
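
The size of these gaps can be checked with a few lines of arithmetic. The Python sketch below is illustrative only; the access times and prices are those read from the parenthetical entries of Table 1 (160 ns, 1.44 us, 130 us, 5 ms, 38 ms, 6 sec and $8.80, 50¢, 8.8¢, 2.2¢, .045¢, .0022¢ per byte), which is an assumption wherever the printed table is hard to read, and the function and variable names are hypothetical.

```python
import math

# (level name, random access time in seconds, price in dollars per byte)
# Values are assumed from the parenthetical entries of Table 1.
LEVELS = [
    ("Cache Store", 1.6e-7, 8.8),
    ("Main Store", 1.44e-6, 0.50),
    ("Bulk Store", 1.3e-4, 0.088),
    ("Large Store", 5e-3, 0.022),
    ("Mass Store", 3.8e-2, 0.00045),
    ("Giant Store", 6.0, 0.000022),
]


def adjacent_access_ratios(levels):
    """Ratio of access times between each pair of adjacent levels."""
    return [(upper[0], lower[0], lower[1] / upper[1])
            for upper, lower in zip(levels, levels[1:])]


# All six levels, including the two new technologies marked "?".
ratios = adjacent_access_ratios(LEVELS)

# Only the four "traditional" technologies.
traditional = [lvl for lvl in LEVELS
               if lvl[0] not in ("Bulk Store", "Giant Store")]
traditional_ratios = adjacent_access_ratios(traditional)

# Span of price per byte from the fastest to the slowest level.
price_orders = math.log10(LEVELS[0][2] / LEVELS[-1][2])

for upper, lower, ratio in ratios:
    print(f"{upper:12s} -> {lower:12s} access time ratio {ratio:8.1f}")
print(f"price span: about {price_orders:.1f} orders of magnitude")
```

Under these assumed values, the step from Main Store directly to Large Store is a factor of roughly 3500 in access time, while inserting the Bulk Store splits it into steps of roughly 90 and 40. The price per byte alone spans between five and six orders of magnitude.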
2.1.4 Understanding of Program and System Behavior

As noted earlier, the detailed operational behavior of
computer systems is often extremely complex. Thus,
decisions on hardware, software, and system design must
often be made in spite of insufficient knowledge. A better
understanding of program and system behavior is essential to
the intelligent and efficient development of future systems.

It is hoped that the research to be conducted as part
of this thesis will shed considerable light on these
matters.

2.2 Hierarchy Approaches

"Storage hierarchy system" and similar terms have been
used in many contexts. Consistent with the objectives
outlined in the previous section, certain particular
contexts are assumed in this research.

2.2.1 Spectrum of Approaches

The problems of storage hierarchy management have been
attacked by a host of approaches. We can loosely
characterize these efforts into three categories:
2.2.1.1 Manual Hierarchy Management

Given a specific ensemble of storage device
technologies, after considerable thought the programmer can
explicitly or implicitly specify how his information (i.e.,
programs and data) should be organized and distributed
within the hierarchy and how and when his information should
be re-arranged. Having determined the distribution, he must
also specify his access to specific information accordingly.

When a programmer is directly operating upon his
information at the lowest level (e.g., using machine
language, direct I/O requests, etc.), he is explicitly
controlling the storage hierarchy; this is explicit manual
hierarchy management. In most conventional systems, the
programmer communicates with the system via programming
languages and control cards. Although this can relieve much
of the tedious or intricate details of storage management,
the overall control of the storage hierarchy is still
primarily the responsibility of the programmer. This is
implicit manual management.

Manual  storage  aanageaent  can  be  very  ecouoaical  since 
it  usually  requires  no  special  hardware  features  nor  special 
system  software.  Furthermore,  it  places  the  control  of  the 
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storage  hierarchy  in  the  hands  of  the  programmer  wno  is 
pres usably  the  oue  moat  faaiiiar  vith  his  needs.  rtanuai 
storage  management,  ia  its  aany  manifestations,  is  the  most 
coma on  storage  hierarchy  approach  in  use  touay. 

Manual storage management has many disadvantages,
though. The amount of detail that the programmer must
understand and use can add significant complexity to this
task. This then introduces additional areas of error and
decreased productivity. Furthermore, the assumption that
the programmer is the best judge of optimal storage
organization is often wrong. The complexities and dynamics
common to modern systems are often beyond the understanding
of most application programmers.

Multiprogramming, an almost universal technique in
current systems, necessitates strategies for global
optimization which usually differ substantially from the
individual local optimizations of each program. For these
reasons there has been a continual search for "a better way".

2.2.1.2 Semi-Automatic Hierarchy Management

Many techniques have been developed to minimize the
amount of effort required of the programmer and to provide
feedback to him. The programmer still has the ultimate
responsibility in such a semi-automatic hierarchy management
system.


Certain of these techniques are based upon the concept
of the programmer providing "hints" to the system. These
hints form the basis for a partially automated, partially
manual storage management system. Although not especially
widespread, this approach has been used in several systems
(e.g., Jensen et al [53], O'Neill et al [70], etc.).


If there is a single application that is quite large
and complex, techniques have been developed to analyze the
actual performance and provide feedback to the programmer.
This approach is primarily used in specialized, dedicated,
predictable, high-performance systems, such as an airline
reservations system. Numerous attempts have been reported,
such as Arora et al [5], Ramamoorthy et al [71], etc.


The various semi-automatic hierarchy management
approaches help to reduce the programmer's effort and to
attain a better local optimization. Although useful for
certain applications, these strategies do not remove the
disadvantages already noted with manual hierarchy management
systems.

2.2.1.3 Automatic Hierarchy Management

Certain aspects of logical information organization are
inherent in a programmer's basic algorithm. In an automatic
hierarchy management system, all aspects of the physical
information organization and distribution that are
irrelevant to the underlying logical structure should be
removed from the programmer's responsibility. The
programmer may wish to, maybe even be encouraged to, use
algorithms that are known to perform well in conjunction
with the automated hierarchy management. But the central
responsibility of the storage hierarchy management is
removed from the programmer.

Since  this  approach  directly  focuses  on  the  storage 
hierarchy  objectives  presented  earlier,  it  will  be  the 
primary  approach  to  be  pursued  in  this  thesis. 

2.2.2  Spectrum  of  Analysis  Efforts 

Each of the storage hierarchy approaches mentioned
above, primarily semi-automatic and automatic, has been
subjected to various forms of analysis. In this section we
briefly outline the principal deficiencies of these efforts.

2.2.2.1 Generalized Models

One popular form of analysis is to assume a generalized
model for hardware, software, and system behavior. If one is
careful in choosing the characteristics of the model (e.g.,
Poisson arrival and service times, etc.), it is possible to
develop precise analytical solutions. Unfortunately, it is
usually difficult to validate these models except for rather
simple solutions. Furthermore, since there are few truly
automatic storage hierarchy systems in general use, it is
extremely difficult to determine realistic parameters
for these generalized models even if the models were valid.

Generalized models have been reported in several
papers, such as Aho et al [2] and Denning [25] in the
Bibliography.

2.2.2.2 Constrained Models

Another variation on the generalized model scheme is to
analyze a particular program and then model its relationship
to the rest of the system. There are at least two
shortcomings in this approach. First, as in the generalized
model case, it is difficult to realistically model the
relationship between a program and the rest of the system.
Second, the analysis and measurement of the particular
program is normally converted into some form of probability
matrix or probabilistic reference pattern. In either case,
significant effort is required to accurately measure the
program's behavior. Furthermore, the probabilistic
characteristics are usually aggregated to reflect the
overall behavior of the program and, as a result, the
dynamic nature of the program and its impact on the storage
hierarchy are often lost.

Example analyses of constrained models can be found in
references: Arora and Gallo [5], Hatfield and Gerald [47],
Lewis and Yue [60], and Ramamoorthy and Chandy [72].

2.2.2.3 Limited Environment

A common deficiency of most previous research is that
only a limited environment was considered, in particular
automatic hierarchy management over only two levels using a
single page size. Of course, most current-day computers have
only employed automatic hierarchy management in either Cache
Systems (cache store - main store) or Paging Systems (main
store - large store). Unfortunately, there are definite
reasons to believe that many of the conclusions and
techniques demonstrated for a two-level hierarchy do not
necessarily generalize to handle the spectrum of program
detail and device characteristics encountered in a truly
multiple level storage hierarchy. Furthermore, many of the
papers that attempted to investigate general storage
hierarchies assumed techniques and approaches that are
primarily based upon two-level hierarchy assumptions.

This limited environment has been studied by numerous
people, such as Aho et al [2], Belady et al [10,11,12],
Coffman and Varian [13,86], Conti et al [21,22], Denning
[25], Fotheringham [33], Guertin [45], Kilburn et al [57],
Mattson et al [63], Seligman [78], Smith [81], and Wilkes
[88].

2.2.2.4 General Hierarchy Environment

The studies of limited two-level storage hierarchies
have been quite successful in many actual systems. A
reasonable strategy would be to extend these techniques to a
more general storage hierarchy environment. There have been
a few attempts along these lines, but as mentioned in the
previous section, most were hampered by:

(1) attempting to directly apply two-level hierarchy
techniques without carefully considering their
applicability,

(2) attempting to generalize techniques which were not
even fully understood in a two-level environment.

The major thrust of this thesis is to provide insight
into and shed additional light on these problems.
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CHAPTER  3. 

FORMALIZATION OF STORAGE HIERARCHIES AND RELATED RESEARCH


3.0 Introduction


In this chapter a formalization of the key
characteristics of storage hierarchies is presented and
performance measures are derived. The reported performance
of actual systems is reviewed.

3.1 Parameters of a General Storage System


Table 2 and Figure 1 illustrate the major parameters of
a storage hierarchy system. These parameters can be grouped
into four categories: (1) basic technology, (2)
configuration, (3) algorithm, and (4) program behavior.

3.1.1  Basic  Technology 


The basic technology parameters, cost/byte, C, and
access time, T, are primarily dependent upon the
physical properties of the storage device technology. At any
given time the state-of-the-art offers only a limited number
of (C,T) alternatives that the system designer can select.

Basic Technology

•  C  cost/byte

•  T  average access time

Configuration

•  L  number of levels

•  I  interconnection of levels

•  S  size (capacity)

•  B  transfer rate (bandwidth)

•  N  number of bytes in page (page size)

Program Behavior

•  A  address trace

Algorithm

•  F  fetch

•  P  placement

•  R  replacement

Table 2.

Major Parameters of a Storage Hierarchy System


[Figure 1: a request generator (processor) issues the address
trace A to Level 1, which is followed by Level 2, Level 3,
..., Level L; the levels are labeled with their (C,S) and
(T,B) parameters and the page size N.]

Figure 1.

Structure of a Storage Hierarchy System

3.1.2 Configuration

The system designer does have flexibility in organizing
the storage devices. By serial and/or parallel structuring
of the components of a given level of storage device
technology, it is possible to specify, over a wide range of
values, the size (storage capacity), S, and the transfer
rate (data bandwidth), B, of the system. For
example, if a particular technology provides a basic device
with S=s and B=b, connecting n of these devices in parallel
produces a storage level with S=ns and B=nb. (To some extent
the mechanism and cost of the organizational structure does
influence the overall cost/byte and average access time of a
level; this effect is usually minimal for small values of
n.)

On a more global basis, the designer must determine the
number of levels, L, in the storage system, the
interconnection of the levels, I, and the page size, N
(the unit of information moved between levels).

3.1.3 Program Behavior

The processor, under program control, produces a
sequential series of references to the storage system. These
processor references are in the form of logical addresses,
which are used to uniquely identify each individual
unit of stored information (e.g., an 8-bit byte)
independent of its location (i.e., $M^1$, $M^2$, $M^3$, ...). The
time sequence of logical address references, A, is called an
address trace or address reference pattern. In general, each
unique program and its input data will result in a different
processor address trace. For purposes of analyzing the
effects of the storage hierarchy, the address trace is
the primary characterization of a program that is needed
(e.g., we don't care what the program's purpose is or what
language it is written in, etc., we only care about its
address trace). Thus, the address trace describes the
program behavior as observed by the storage hierarchy.

3.1.4 Algorithms

There are three basic decision algorithms that must be
employed by an automatic storage management system. Fetch,
F, decides when and which information should be moved up a
level (e.g., from $M^2$ to $M^1$). Placement, P, decides where
information should be placed in a level. Removal or
replacement, R, decides when and which information should be
transferred down a level (e.g., from $M^1$ to $M^2$).

A completely general storage hierarchy algorithm, H,
must consider all the parameters described above:

H = f(<Technology>, <Configuration>, <Program>, <Algorithm>)

H = f(<C,T>, <L,I,S,B,N>, <A>, <F,P,R>)

Clearly, attempting to optimize a system with so many
parameters is difficult. Fortunately, it is possible to
eliminate from concern or at least simplify certain
parameters as explained below.

3.2.1  Configuration 

Consistent with the title of this thesis, we shall
consider only hierarchical interconnections of levels as
illustrated in Figure 1, where $T^1<T^2<T^3<$ etc. and
$N^1<N^2<N^3<$ etc. The rationale for this decision is
elaborated in the thesis.

There are three basic strategies for information
movement sizes: (1) select a single page size value, N,
which is always used throughout the hierarchy - this
approach is used on most contemporary automatic multilevel
storage systems (e.g., Multics), (2) allow an arbitrary
range of values for N to be used - this approach is
primarily used on manually managed storage systems, and (3)
select values of N such that a specific unit of transfer is
used between any two levels of the hierarchy - this approach
will be pursued and justified in this thesis.

3.2.2 Program Behavior

Each logical address can be represented as a bits as
shown in Figure 2(a). If the page sizes, N, are chosen to
be powers of 2, the set of 2**a possible addresses can be
partitioned into 2**p pages of N = 2**n consecutive logical
addresses each as shown in Figure 2(b). [Note: the notation
"2**a" means 2 raised to the power a.] Since the information
movement between storage levels is accomplished by
transferring pages, we can analyze this interlevel movement
by merely considering the time sequence of logical page
references, Ap, called a page trace.

Since we allow the page size to be different between
each level and requests are only passed down to a given
level if they cannot be satisfied by any higher level, each
level will usually experience a different page trace though
all are algorithmically derivable from the same address
trace. In fact, if all address references were broadcast to
all storage levels, the page traces could be determined by a
simple mapping from logical addresses into logical pages:

page address = integer( logical address/N )

where N is the page size for that level.

[Figure 2: (a) a logical address of a bits; (b) the same
logical address divided into a page address of p bits and a
displacement of n bits (a = p+n).]

Figure 2.

Format of Logical Address
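The mapping from an address trace to a page trace can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample address trace, and the page sizes of 1024 and 4096 are hypothetical and do not come from the thesis.

```python
def page_address(logical_address: int, page_size: int) -> int:
    """Map a logical address to its page: integer(logical address / N)."""
    return logical_address // page_size


def split_address(logical_address: int, n_bits: int) -> tuple[int, int]:
    """Split an address into (page address, displacement) for N = 2**n_bits."""
    page = logical_address >> n_bits
    displacement = logical_address & ((1 << n_bits) - 1)
    return page, displacement


# Hypothetical address trace A, mapped with two different page sizes.
address_trace = [0, 5, 1030, 4100, 1031, 7]
page_trace_small = [page_address(a, 1024) for a in address_trace]
page_trace_large = [page_address(a, 4096) for a in address_trace]
```

The same address trace yields a different page trace for each page size, which is why each level of the hierarchy may observe a different reference pattern.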
3.2.3 Algorithm

The placement decision, P, is usually unconstrained or
minimally constrained and, as a result, has relatively
little impact upon performance.

A demand fetch policy will be used. Assume that at time
t a request for logical address a (or, equivalently,
$p^1$=integer(a/$N^1$)) arrives at level $M^1$. At that instant the
information may currently reside in $M^1$, otherwise it must be
found in a lower level. Under demand fetch, if $p^1$ is in $M^1$,
the reference proceeds, the information is passed back to
the processor, and no other page movement occurs in the
hierarchy. If $p^1$ is not in $M^1$, a request for
$p^2$=integer(a/$N^2$) is sent from $M^1$ to $M^2$. If $p^2$ is in $M^2$, the
page is transferred to $M^1$ and processing continues as
described above, otherwise a request for $p^3$=integer(a/$N^3$) is
sent from $M^2$ to $M^3$, etc. Note that under the demand fetch
policy, information is only moved up in the hierarchy when
and if it is explicitly demanded (i.e., requested) by the
processor.
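The demand fetch walk down the hierarchy can be sketched as follows. This is a minimal sketch, not the thesis's algorithm in full: levels are indexed from 0 instead of 1, the page sizes are hypothetical, the last level is assumed to hold all information, and capacity limits (and therefore removal) are deliberately ignored.

```python
def demand_fetch(address, page_sizes, resident):
    """Return the index of the level that satisfies a request.

    page_sizes[i] is the page size used to identify pages at level i and
    resident[i] is the set of page addresses currently held at level i.
    The last level is assumed to contain everything.
    """
    last = len(page_sizes) - 1
    found = last
    for level, n in enumerate(page_sizes):
        if level == last or (address // n) in resident[level]:
            found = level
            break
    # Information moves up only on demand: each higher level now holds
    # the page containing the requested address.
    for level in range(found):
        resident[level].add(address // page_sizes[level])
    return found


page_sizes = [16, 64, 256]            # hypothetical page sizes per level
resident = [set(), set(), set()]      # all upper levels start empty
first = demand_fetch(100, page_sizes, resident)
second = demand_fetch(101, page_sizes, resident)
third = demand_fetch(120, page_sizes, resident)
```

In this run the first request falls through to the last level, the second is satisfied at the top level because its page was just brought up, and the third is satisfied by the middle level, whose larger page already covers the address.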

Although demand fetch is only one possible fetch
algorithm, it can be shown ini] that for hierarchically
structured storage systems:

"... given any trace and replacement algorithm (not
necessarily using demand paging) another replacement
algorithm exists that uses demand paging and causes
the same or fewer total number of pages to be
transferred ..."

In other words, as you might intuitively suspect, moving
pages only when necessary results in the minimal number of
page movements. Of course, if page movement is required and
the higher level that is to receive the page is already
full, the removal algorithm must be employed to provide
space for the new page.

3.2.4 Revised Storage Hierarchy Model

Based upon the discussion above, we can slightly
simplify the parameters remaining for consideration in the
storage hierarchy algorithm, H, so that it need consider
only:

H = f(<Technology>, <Configuration>, <Program>, <Algorithm>)

H = f(<C,T>, <L,S,B,N>, <A>, <R>)

In this thesis all of these parameters will be considered
and investigated. Special emphasis will be placed on
analyzing and understanding the relationship between the
page sizes, N, and the removal algorithm, R, required for
efficient operation of the storage hierarchy.

There are various performance measures that we could
consider. From an overall point of view, system measures,
such as job throughput, job turn-around time, and processor
utilization, are quite significant. Unfortunately, it is
extremely difficult to directly relate these measures to the
performance of the storage system; even an approximation
would require consideration of many more parameters. Thus,
we will only consider measures that relate to the effective
performance of the storage hierarchy.

3.3.1 Performance Measurement Notation

Due to the strict hierarchical structure of our storage
system and the demand fetch policy, we can analyze the
performance of the system by separately considering the
levels of the hierarchy starting with $M^1$. Since a given
level only receives a page fetch request if the information
has not been found in a higher level, each level usually
sees a different page trace, $Ap^1$, $Ap^2$, $Ap^3$, etc.

There are several important properties of page traces.
If P is a particular page trace (e.g., $Ap^1$) of a program, we
define:

•  |P|  length of the page trace sequence

•  Q  set of distinct pages referenced in P

•  |Q|  number of pages in Q

For example, in the page trace

P = a, b, a, c, b, a

we observe that

|P| = 6

Q = {a, b, c}

|Q| = 3

(Lower case letters will be used to represent logical page
addresses instead of page numbers).
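These properties are straightforward to compute. A minimal Python sketch for the example trace follows; the helper name `trace_properties` is hypothetical.

```python
def trace_properties(page_trace):
    """Return (|P|, Q, |Q|) for a page trace P."""
    distinct = set(page_trace)      # Q: set of distinct pages referenced
    return len(page_trace), distinct, len(distinct)


P = ["a", "b", "a", "c", "b", "a"]
length_p, Q, size_q = trace_properties(P)
```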

For a specific storage hierarchy, we define |M| to be
the size of M in units of pages receivable from the next
lower level. For example, $|M^1|=S^1/N^2$, $|M^2|=S^2/N^3$, etc.

For a specific page trace, P, storage level, M, and
removal algorithm, R, we define the result page trace or
failure trace, P', as the time sequenced page references
of P that were not found in M. We shall call page
references that are found in M successes. The success
function, Sf, is the number of references satisfied by M and
can be computed as |P|-|P'|. By analogy to the success
function, the number of references not satisfied by M, |P'|,
is called the failure function, Ff. In general, we wish to
maximize the success function or, equivalently, minimize the
failure function. It is convenient to normalize the failure
function by defining the failure frequency function, f,

$f = |P'|/|P|$

The success frequency function, s, can be easily computed as
1-f; it is often called the hit rate on a two-level storage
system. We also define the system failure frequency
function, $f^0$, of a level to be:

$f^0 = |P'|/|A|$

where A is the address trace generated by the processor and
|A| is the length of the address trace (it is also true that
|A| always equals $|P^1|$, thus they may be used
interchangeably). The system success frequency function is
correspondingly defined as $s^0 = 1 - f^0$.

If we apply the definitions above to the processor
generated page trace, $P^1$, received by $M^1$, we note that the
result page trace, P', is essentially the page trace, $P^2$,
received by $M^2$. There is a minor relabeling required to
adjust for the difference in page size used by $M^2$,
$P^2 = P'(N^1/N^2)$. By repeating this process recursively, we can
develop the effective page traces, failure and success
functions, and failure and success frequency functions for
each level of the hierarchy. Since we assume that all
referenced information exists in the storage hierarchy, the
sum of the system success frequency functions must be 1.


One general measure of a storage hierarchy's
performance is its effective access time, T', and effective
cost, C', which are defined as follows:

$T' = T^1 s^{01} + T^2 s^{02} + T^3 s^{03} + \dots$

$C' = (C^1 S^1 + C^2 S^2 + C^3 S^3 + \dots)/(S^1 + S^2 + S^3 + \dots)$

T' and C' can be viewed as characterizing the entire storage
hierarchy according to a corresponding one-level system.
From a cost/performance point of view, one should be
indifferent between a single-level single-technology storage
device with average access time, T', and average cost/byte,
C', and a storage hierarchy system with performance
parameters (T',C'). In particular, if the system designer
needs a storage performance (T,C) and no such basic
technology exists, he must attempt to develop a storage
hierarchy such that (T',C') = (T,C).
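The two definitions can be evaluated directly. The sketch below is illustrative only: the function names and the three-level figures (access times, costs, sizes, and system success frequencies) are hypothetical values chosen for the arithmetic, not measurements from the thesis.

```python
def effective_access_time(access_times, system_success):
    """T': access time of each level weighted by its system success frequency."""
    return sum(t * s for t, s in zip(access_times, system_success))


def effective_cost(costs, sizes):
    """C': total cost of all levels divided by total capacity."""
    return sum(c * s for c, s in zip(costs, sizes)) / sum(sizes)


# Hypothetical three-level hierarchy (arbitrary units).
access_times = [1.0, 10.0, 100.0]
system_success = [0.90, 0.09, 0.01]   # assumed to sum to 1
costs = [10.0, 1.0, 0.1]              # cost/byte per level
sizes = [1.0, 10.0, 100.0]            # capacity per level

T_eff = effective_access_time(access_times, system_success)
C_eff = effective_cost(costs, sizes)
```

With these assumed values the effective access time stays close to that of the fastest level while the effective cost/byte stays close to that of the cheapest level, which is the outcome a hierarchy is meant to achieve.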

3.3.2 Page Trace Simulation

One way to determine the success frequency function and
the result page trace for a specific page trace is to
simulate the storage management algorithms and note the
contents of M at each step of the page trace. Clearly,
these results depend upon numerous parameters (e.g.,
specific trace, removal algorithm, size of M, etc.). Figure
3 illustrates this step by step simulation assuming demand
paging, FIFO (first-in first-out) removal, and |M| = 2
pages. For simplicity, the page trace, P, has been
normalized to be expressed in units of receivable pages. In
particular, if M is $M^1$, then $|M|=S^1/N^2$ and p=integer(a/$N^2$)
where a is a logical address reference and p is the
corresponding page reference. The pages in M are shown as
ordered to indicate the FIFO ordering; the top page is the
"last" ("latest") page fetched into M, whereas the bottom
page is the "first" ("oldest") page in M and is the page
selected for replacement when necessary. The asterisk (*)
indicates that a fetch was required from a lower level of
the hierarchy; the page reference is thus noted as part of
the result page trace, P'.

•  P = a, b, b, c, b, a, d, c, a, a

•  |P| = 10

•  Q = {a, b, c, d}

•  |Q| = 4

•  |M| = 2

•  FIFO Removal

Page Trace, P   | a | b | b | c | b | a | d | c | a | a |
Fetch           | * | * |   | * |   | * | * | * | * |   |
M Contents      | a | b | b | c | c | a | d | c | a | a |  <-"new"
(after each     |   | a | a | b | b | c | a | d | c | c |  <-"old"
reference)      |   |   |   |   |   |   |   |   |   |   |
Page Trace, P'  | a | b |   | c |   | a | d | c | a |   |

Results

•  Ff = |P'| = 7

•  Sf = |P|-|P'| = 3

•  f = 70%

•  s = 30%

•  P' = a, b, c, a, d, c, a

Figure 3.

Example of Page Trace Simulation

It is normally assumed that all levels, except level L,
are empty initially, thus there is a transient stage during
which pages are loaded into M without any replacements
needed. Since there are so few pages in M during this
start-up stage, there are many fetches required. We will
find it useful to separate out this transient phenomenon.
This transient consists of the page trace up to the first
|M| unique page references; in the example of Figure 3 this
is the first 2 page references (i.e., a, b). Consider the
case if |Q|<|M|: there would be no further fetches into this
level after the initial transient that loads the |Q| pages
into M. In this case, |P'|=|Q| exactly, independent of |P|,
and s tends toward 1 as |P| increases.

In the particular example illustrated in Figure 3, we
note that there were 3 'hits' and 7 'misses' out of 10 page
references, so that s=30%. Thus, P' only consists of 7 page
references to the lower levels.
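The simulation of Figure 3 can be reproduced with a short Python sketch. It assumes demand paging with FIFO removal for a single level that starts empty; the function and variable names are hypothetical.

```python
from collections import deque


def simulate_fifo(page_trace, capacity):
    """Demand paging with FIFO removal; returns the result page trace P'."""
    resident = deque()              # leftmost = oldest page, rightmost = newest
    result_trace = []
    for page in page_trace:
        if page in resident:
            continue                # success: no page movement
        result_trace.append(page)   # failure: fetch from a lower level
        if len(resident) == capacity:
            resident.popleft()      # remove the oldest resident page
        resident.append(page)
    return result_trace


P = list("abbcbadcaa")              # page trace of Figure 3
P_prime = simulate_fifo(P, 2)       # |M| = 2
Ff = len(P_prime)                   # failure function
Sf = len(P) - Ff                    # success function
f = Ff / len(P)                     # failure frequency
s = 1 - f                           # success frequency

# Transient-only case: |Q| = 2 distinct pages fit in a level with |M| = 4.
small_working_set = simulate_fifo(list("abab" * 5), 4)
```

The second call illustrates the transient case discussed above: when every distinct page fits in the level, the only fetches are the initial loads.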

3.4 Related Research

As noted above, we wish to develop a storage hierarchy
with attractive cost/performance, (C',T'), characteristics.
It is clear that we can arbitrarily decrease the cost/byte
by making the size of each level, S, increasingly larger as
we go from the high-performance high-cost to the
low-performance low-cost levels (i.e., $C^1>C^2>C^3>\dots$ and
$S^1<S^2<S^3<\dots$). In fact, this approach is the basic
motivation for storage hierarchies.

Unfortunately, if the processor generated address
references that were uniformly distributed in time and
address, each byte would be equally likely to be referenced
at any instant. This probability would be:

$\Pr[\text{reference } a] = 1/(S^1+S^2+S^3+\dots)$

Thus, the expected system success function, $s^0$, for each
level is proportional to the size of the level. For example,

$s^{01} = S^1/(S^1+S^2+S^3+\dots)$

But, since we have assumed that $S^1<S^2<S^3<\dots$, we find that
$s^{01}<s^{02}<s^{03}<\dots$ Thus, the system success function for the
Lth level dominates (i.e., is approximately 1) since we have
assumed that it is the largest level. Referring back to our
definition of effective access time, we find that T' would
be approximately equal to the access time of the lowest
performance level (level L) since all the other terms would
be negligible. If this analysis were true, our storage
hierarchy would result in a performance just slightly better
than our lowest performance level at a moderate increase in
price - not an especially exciting result. Fortunately,
actual storage hierarchies do not behave this way. We will
briefly review some related research on this subject.
3.4.1 Locality

It has been empirically observed that actual programs
cluster their references so that, during any interval of
time, only a subset of the information available is actually
used. A detailed discussion of this phenomenon will be
presented in the thesis.

It is important to note that due to our basic rankings
of page sizes and access times in the storage hierarchy,
each level "sees" a different view of the program. The high
levels of the hierarchy must follow the microscopic
instruction by instruction reference pattern whereas the
middle levels follow a more gross subroutine by subroutine
pattern. The very low levels are primarily concerned about
the processor's references as it moves from subsystem to
subsystem. We do not have any a priori guarantee that
locality of reference holds equally true for all of these
views, but we do have some reported evidence to encourage
us. Most of these studies have been based upon two-level
storage systems or restricted forms of three-level
hierarchies.

3.4.2 Paging Systems

The earliest automatic storage systems were based upon
two-level core-drum hierarchies (devices 2 and 4 of Table
1). This technique was introduced in the Atlas system
[3b,57] during the early 1960's. It has since been used on
many contemporary systems.

The performance of paging systems has been studied by
various researchers, such as Belady [12], Coffman and Varian
[19,66], Hatfield [46], and Sayre [77]. In Coffman's
results, for example, it was noted that even though
$S^1/(S^1+S^2)=0.25$, $s^1$ often exceeded 95%. Hatfield studied
the performance of system programs that had been carefully
designed and found that for $S^1/(S^1+S^2)$ ratios as low as
0.25, it was possible for $s^1$ to often exceed 99.99%.

3.4.3 Cache Systems

Cache systems are based upon two-level cache-main
hierarchies (devices 1 and 2 of Table 1). Although they have
been proposed as early as 1965 (see Wilkes [88]), the major
commercial use of cache systems did not occur until the
introduction of the IBM System/360 Model 85 [21,bl]. More
recently, this technique has been used in several
contemporary systems, such as the IBM System/370 Model 155
and Model 165 [52].

In these cache systems, IBM found that it was possible
to drastically reduce $S^1/(S^1+S^2)$ to as low as 1% and still
keep the hit ratio, $s^1$, above 90%. Similar findings were
also reported by Bell and Casasent [13], Mattson [64], Meade
[55], and Seligman [78].

3.4.4 Three-level Systems

There have been a few three-level systems reported in
the literature; unfortunately they have all been somewhat ad
hoc in design and the results are far from conclusive.
There have been at least three types of such hierarchies
studied.

3.4.4.1 Main-Bulk-Mass Store Hierarchy

There have been several systems devised based upon
devices 2, 3, and 5 of Table 1. The Bulk store actually
used, called Large Core Store (LCS), had a much lower access
time (around 8 us) and a much higher price (about 25m/byte).
In order to compensate for peculiarities in the hardware
structure and out of considerable concern for the extreme
cost of LCS, these systems tended to become much more
manually managed hierarchies than automatically managed.
Although they were found to be effective, it is difficult to
generalize the results. The most ambitious attempt reported
was undertaken by Carnegie-Mellon University [36]. Results
have also been reported by Durae [31], Williams [89], and
others.

3.4.4.2 Main-Large-Mass Store Hierarchy

There do not appear to be any automatically managed
systems of this type published in the general literature.
The Multics system at MIT Project MAC has recently
introduced a "page-multilevel" strategy based upon devices
2, 4, and 5 of Table 1. Only limited findings have been
reported to date, but it has been stated in the March 1972
issue of the MIT Information Processing Services Bulletin
(p. 11) that it

"... does pay off since it meets fluctuating demands
on the system, reduces the workload for the disks to
an efficient level, is inexpensive, and keeps pages
on the drum for an acceptable length of time."

As an indication of its effect, the new strategy is reputed
to have increased the success frequency function, s^2, of the
drum from 80% to more than 90% (i.e., "reduced from one page
read from the disk for every four reads from the drum, to
one page read from the disk for every ten to twenty pages
from the drum").

3.4.4.3 Main-Large-Giant Store Hierarchy

The work of Considine and Weis [20] is difficult to
categorize. It is based upon a three-level hierarchy where
the first level corresponds to device 2 (main store) of
Table 1, the second level corresponds to a combination of
devices 4 (drums) and 5 (disks), and the third level
consists of removable disks which can best be approximated
by device 6 of Table 1. It is impossible to compute any
success frequency functions from their data, but it appears
that for S^2/(S^2+S^3) = 0.5, s^2 is very high. They note
(p. 443), in particular, "most of the data moved to the
archival storage (i.e., M^3) have stayed there."
3.4.5 Need for Additional Research

Although the results of research described above are
encouraging, the design and performance of general
multiple-level storage hierarchies are still inconclusive.
This thesis is intended to provide specific results in this
area.
CHAPTER 4.

A STORAGE HIERARCHY SYSTEM

4.0 Introduction

In this chapter a design for a general multiple level
storage hierarchy system, in particular with three or more
levels, is presented. This design is based upon an orderly
and uniform treatment of the logical structure of the
storage levels and their interconnections. In addition to
providing a solution to convenient storage management for
the user, this design is intended to produce good
performance for the storage hierarchy as measured by its
effective access time, T*, and effective cost, C*. The
principal and novel techniques to be used are described
separately in the sections below.

4.1 Continuous Hierarchy

As noted earlier, automatic storage hierarchy systems
are still in the minority. Amongst those systems that do
provide automatic storage hierarchy management, the majority
limit their scope to two levels, with a few rare three level
systems. As a result of these limitations, the user is
forced to rely on manual or semi-automatic storage
management techniques to deal with the storage levels that
are not automatically managed. Thus, an automatic storage
management system should consist of a continuous hierarchy
that encompasses the full range of storage levels.


4.1.1 Cost/Performance of Adjacent Levels


A major obstacle to generalizing storage management
algorithms, in particular in two-level paging systems, is
the tremendous contrast, often over 3 orders of magnitude,
in cost/performance between M^1 and M^2. As illustrated in
Table 1 (page 28), a representative Main Store, M^1, has an
access time of 1.44 us compared to a Large Store, M^2, with
an access time of 5 ms. In such a two-level system, the
effective access time, T*, is (with all times in
microseconds)

    T* = T^1 s^{01} + T^2 s^{02}

    T* = 1.44 s^{01} + 5000 s^{02}

and since s^{01} + s^{02} = 1, we can substitute
s^{01} = 1 - s^{02} to get

    T* = 1.44 - 1.44 s^{02} + 5000 s^{02}
    T* = 1.44 + 4998.56 s^{02}

In order to attain an effective access time, T*, that is
comparable to the Main Store access time, T^1, we must keep
the system success frequency function, s^{02}, very close to 0
or, correspondingly, keep s^{01} very close to 1. Even with
s^{01} already very high, an improvement to 99.9% could cut the
effective access time, T*, in half. With such pressure to
attain very high s^{01} values, the systems designer is often
forced to seek out very specialized techniques in contrast
to our goals of orderly and uniform algorithms.
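
The sensitivity of T* to the success frequency can be checked
numerically. The following is a minimal sketch, assuming the
two access times quoted above (1.44 us and 5 ms); the function
name and the sample s^{01} values are illustrative and do not
come from the thesis.

```python
def effective_access_time(t1_us, t2_us, s01):
    """Two-level effective access time T* = T^1 s^{01} + T^2 s^{02}.

    s01 is the fraction of references satisfied by M^1; the remainder,
    s02 = 1 - s01, is satisfied by M^2.
    """
    s02 = 1.0 - s01
    return t1_us * s01 + t2_us * s02


T1_US = 1.44      # Main Store access time (from the text)
T2_US = 5000.0    # Large Store access time, 5 ms (from the text)

# Sample success frequencies, chosen here only for illustration.
for s01 in (0.90, 0.99, 0.998, 0.999, 0.9999):
    t_star = effective_access_time(T1_US, T2_US, s01)
    print(f"s01 = {s01:.4f}  ->  T* = {t_star:9.3f} us")
```

In this sketch, moving s^{01} from 0.998 to 0.999 roughly
halves T*, while T* is still several times larger than T^1.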

4.1.2 Moderate Cost/Performance Ratios

In order to make the storage hierarchy design robust
and flexible, the cost/performance characteristics should
differ by less than two orders of magnitude between adjacent
levels. Thus, success frequency functions in the range 90%
to 99% are adequate to insure reasonable performance. If
the differences are much greater, it will be difficult to
find sufficiently efficient general algorithms. Since minor
changes in production techniques and technology evolution
can result in a variation of a factor of two or three in the
cost/performance for a given technology, it is not desirable
to decrease much below one order of magnitude difference
between adjacent storage levels.

4.2 Shadow Storage and Page Splitting


The time, Tm, required to move a page between two
levels of the hierarchy usually consists of summing two
components: (1) the average access time, T, and (2) the
transfer time, BxN.

If all page sizes were set to provide exactly the
amount of information, N^1, requested by the processor, the
page movement time would be

    Tm = T + BxN^1

where T and B would depend upon the particular storage
levels. By examining the representative devices shown in
Table 1 (page 28), we see that access time varies much more
than transfer rate (i.e., access time spans 6 orders of
magnitude whereas transfer rate varies by only 3 orders of
magnitude).

4.2.1 Marginal Increase in Page Transfer Time and Reference
Probability

Let us assume that N^1 is quite small, such as 8 bytes.
We can ask the question: What is the marginal increase in Tm
if we transfer the adjacent N^1 bytes in addition to the N^1
bytes requested by the processor? Table 3 on the next page
answers this question. Notice that the marginal increase in
Tm decreases from a high of 5.5% (level 2 to level 1) to a
low of .002% (level 6 to level 5). This fact is only
interesting if we also consider the concept of locality (see
Chapters 5 and 6 for additional discussion) and the
question: what is the probability, Pr, that the processor

    Level          Tm             Tm             Marginal
    Transfer       (1 unit)       (2 units)      increase in Tm

    2 to 1 (*)       1.44 us        1.52 us      5.5%
    3 to 2            131 us         132 us      .8%
    4 to 3           5006 us        5011 us      .1%
    5 to 4          38010 us       38020 us      .03%
    6 to 5         600013 us      600027 us      .002%

    Table 3. Marginal Increase in Page Transfer Times


* The figures for access time and transfer rate for the
Main Store listed in Table 1 are approximations that are
only meaningful for very large page sizes. For the page
sizes under consideration in this chapter, the figures used
in the table above are more appropriate.

will reference the adjacent N^1 bytes within a short interval
of time, such as Tm seconds? Due to locality of program
reference, we would expect Pr to be much larger than merely
the reciprocal of the logical address space size.
Furthermore, Pr should increase as Tm increases. Thus, for
a given level, if Pr is larger than the marginal increase in
Tm, it is beneficial to transfer the additional N^1 bytes and
thereby avoid the necessity of expending Tm seconds to
transfer these N^1 bytes later separately.

These same arguments can be applied to the question of
transferring the adjacent n x N^1 bytes, etc. Since the
marginal increase in Tm decreases monotonically as a
function of storage level, the number of N^1 byte packets to
be transferred as a single page should increase
monotonically. This confirms our earlier decision that
N^1 < N^2 < N^3 < etc.
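
The break-even argument above can be written out as a short
calculation. This is only a sketch: the Tm figures are those
of Table 3, while the function names and the sample value of
Pr are assumptions introduced for illustration.

```python
# (level transfer, Tm for 1 unit in us, Tm for 2 units in us), from Table 3.
TABLE_3 = [
    ("2 to 1", 1.44, 1.52),
    ("3 to 2", 131.0, 132.0),
    ("4 to 3", 5006.0, 5011.0),
    ("5 to 4", 38010.0, 38020.0),
    ("6 to 5", 600013.0, 600027.0),
]


def marginal_increase(tm_one_unit, tm_two_units):
    """Fractional increase in Tm caused by moving one extra N^1-byte unit."""
    return (tm_two_units - tm_one_unit) / tm_one_unit


def worth_transferring(pr, tm_one_unit, tm_two_units):
    """Rule from the text: move the adjacent unit when Pr exceeds the
    marginal increase in Tm."""
    return pr > marginal_increase(tm_one_unit, tm_two_units)


PR_EXAMPLE = 0.01   # hypothetical reference probability, not a measured value

for name, tm1, tm2 in TABLE_3:
    frac = marginal_increase(tm1, tm2)
    decision = worth_transferring(PR_EXAMPLE, tm1, tm2)
    print(f"{name}: marginal increase {frac:.5%}, transfer extra unit: {decision}")
```

With the assumed Pr of 1%, the extra unit is not worth moving
between levels 2 and 1 but is worth moving at every lower
level, which is the monotonic trend the text relies on.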


4.2.2  Choice  of  Page  Size 

In order to simplify the implementation of the system
and to be consistent with the mapping from logical address
to page address illustrated in Figure 2 (page 46), we will
require that all page sizes be a power of two. Thus, each
page size (e.g., N^2) is some power of two larger than the
page size of the next higher level (e.g., N^1).
Clearly, the specific values of Pr and thus the choice for
each page size depend upon the characteristics of the
programs to be run and the effectiveness of the overall
storage system. Preliminary measurements indicate that a
ratio of 4:1 between levels is reasonable; Meade [65] has
reported similar findings. Other important factors
affecting page size are discussed in Chapters 5 and 6.

4.2.3  Page  Splitting 

Now let us consider the actual movement of information
in the storage hierarchy. At time t, the processor
generates a reference for logical address a. Assume that
the corresponding information is not currently stored in M^1
or M^2 but is found in M^3. For simplicity assume that page
sizes are doubled as we go down the hierarchy (e.g., N^2=2N^1,
N^3=2N^2=4N^1, etc.; see Figure 4). The page of size N^3
containing a is copied from M^3 to M^2. M^2 now contains the
needed information, so we repeat the process. The page of
size N^2 containing a is copied from M^2 to M^1. Now, finally,
the page of size N^1 containing a is copied from M^1 and
forwarded to the processor. In this process the page of
information is split (i.e., page splitting) repeatedly as it
moves up the hierarchy.
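
Because every page size is a power of two, the page that
contains a given logical address can be found at each level
by masking off low-order address bits. The sketch below
traces the copies made for one reference; the function names,
the example address, and the 8-byte N^1 are assumptions made
for illustration.

```python
def page_start(address: int, page_size: int) -> int:
    """Start address of the page_size-aligned page that contains address."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page size must be a power of two")
    return address & ~(page_size - 1)


def split_path(address: int, n1: int, found_level: int, ratio: int = 2):
    """List the (source level, page size, page start) copies for one fetch.

    Level i uses page size n1 * ratio**(i - 1). The default ratio of 2
    mirrors the "doubling" simplification used in the text.
    """
    steps = []
    for level in range(found_level, 0, -1):
        size = n1 * ratio ** (level - 1)
        steps.append((level, size, page_start(address, size)))
    return steps


# Information found in M^3, N^1 = 8 bytes, logical address a = 1000.
for level, size, start in split_path(address=1000, n1=8, found_level=3):
    print(f"copy {size:3d} bytes [{start}, {start + size}) out of M^{level}")
```

Each page in the list lies inside the page before it, which is
the repeated splitting described above.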
4.2.4 Shadow Storage

As a result of this splitting, the page of size N^1 that
is received by the processor has left a "shadow" consisting
of itself and its adjacent pages behind in all the lower
levels (i.e., shadow storage). Presumably, if the program
exhibits locality of reference, many of these shadow pages
will be referenced shortly afterward and be moved further up
in the hierarchy also.

4.2.5  Copying  of  Pages 

In the strategy presented, pages are actually copied as
they move up the hierarchy; a page at level n has one copy
of itself in each of the lower levels. Since processor
"fetch" requests substantially outnumber "store" requests
(e.g., by more than 5:1 in some measured programs), the
contents of pages are seldom changed. Thus, if a page has
not been changed and is selected to be removed from one
level to a lower level, it need not be actually transferred
since a valid copy already exists in the lower level. The
contents of any level of the hierarchy are always a subset of
the information contained in the next lower level. Thus,
the total information capacity of the system is equal to the
size of the level L store rather than the sum of the
capacities of all the levels. Since the capacity of level L
is assumed to be much larger than the capacity of L-1, etc.,
the difference in total system capacity due to shadow
storage is minimal.

4.3 Direct Transfer

In the description above it is implied that information
actually moves between adjacent levels. This approach,
called direct transfer, is indeed intended. By comparison,
though, many proposed and experimental multiple level
storage systems are based upon an indirect transfer (e.g.,
the Multics "page multilevel" system mentioned in Chapter
3). In these systems, all information is routed through
level 1. For example, to move a page from level n to level
n-1, the page is moved from level n to level 1 and then from
level 1 to level n-1. Clearly, this indirect approach is
undesirable since it requires extra page movement and
consumes a portion of the limited M^1 capacity in the
process.

There have been two major obstacles to direct transfer
in previous systems: (1) interconnection structure and (2)
synchronization.
4.3.1 Interconnection Structure

For many reasons, some technical and some historical,
most contemporary systems are physically structured in a
radial manner. That is, there is a central element to the
system, either the processor itself or the primary store,
and all other storage devices and/or processors are directly
connected to this central element. Except for some possible
control signals, there are no direct data transfer
connections between the non-central elements. This
structure is, of course, quite consistent with a
non-hierarchical storage management system. A logical
storage hierarchy system should be based upon a physically
hierarchical interconnection structure.

4.3.2  Synchronization 

As indicated in Table 1, storage devices often have
different timing and transfer rate characteristics. In order
to accomplish a direct data transfer between levels,
synchronization is necessary. It may be obvious that a
storage device cannot transfer data faster than its rated
performance, but for many storage devices, especially
electromechanical devices, it is also not possible to
transfer data slower than the rated speed.

Based on current technology, this problem can be
solved. Many of the storage devices are now
non-electromechanical (i.e., strictly electrical), such as
the Cache, Main, and Bulk Stores of Table 1. It is quite
feasible to provide direct transfer between any of these
devices and any other storage device; this is one reason for
the radial interconnections described above, where the Main
Store acted as the common means of providing
synchronization. Using a similar approach, we can allow
direct transfer between electromechanical devices if this
transfer is routed through a small and reasonably
inexpensive electrical storage buffer. Femling [33]
discusses such a device, which he calls a rubber-band memory
presumably because it "stretches" to match the
characteristics of the source and destination devices.

4.4 Read Through

In the description above, it is implied that a transfer
up the hierarchy from level 2 to the processor (level 0)
consists of two sequential steps: (1) transfer the page of
size N^2 from level 2 to level 1, and then (2) extract the
appropriate page subset of size N^1 and transfer it from
level 1 to the processor (level 0). In general, a transfer
from level n to the processor would consist of a series of n
steps. Thus the system page transfer time would equal the
sum of n page transfer times:

    Tm^{n0} = Tm^{n,n-1} + ... + Tm^{21} + Tm^{10}

Furthermore, for many electromechanical storage devices, the
second access, required to forward the page subset, may
experience the "maximum" access delay rather than the
"average" (i.e., after storing the information into the
level, a complete mechanical revolution may be required to
reposition to read the same information and forward it to
the next level).

This inefficiency can be avoided by allowing
information to be stored into all upper levels
simultaneously. The accompanying figure illustrates this
mechanism. If information is to be transferred from M^3 to
the processor, M^3 turns on its output data gate, G^3_out,
when it is ready to start and transfers N^3 bytes and their
corresponding logical addresses up the data bus. M^2 turns
on its input data gate, G^2_in, to receive these N^3 bytes.
Furthermore, when the appropriate N^2 bytes needed by M^1 are
detected by M^2, it turns on its output data gate, G^2_out,
and these N^2 bytes are forwarded to M^1 while being stored
in M^2, etc.

For example, assume a reference to logical address a is
generated by the processor and the corresponding information
is currently stored at level n (and all lower levels, of
course). At the instant that the N^1 bytes containing a are
placed on the data bus by level n, these N^1 bytes will be
stored into all levels from level n-1 to level 0 (the
processor) simultaneously. Likewise, the N^2 bytes
containing a are simultaneously stored into all levels from
level n-1 to level 1. This strategy thus makes it appear
that the N^1 byte page requested by the processor is
transferred directly to the processor without any delays.

4.4.1 Page Transfer Time

Using the read through strategy, the page transfer time
to the processor is actually less than the page transfer
time to the adjacent storage level. For example, if the
requested information is stored in M^3, the page transfer
time to the processor, via read through, is

    Tm^{30} = T^3 + N^1 B^3

whereas the page transfer time from M^3 to M^2 is

    Tm^{32} = T^3 + N^3 B^3.

Since N^1 < N^3, then Tm^{30} < Tm^{32}.
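
The comparison can be made concrete with a small calculation.
The device parameters and page sizes below are hypothetical
round numbers chosen only to illustrate the inequalities;
they are not the Table 1 figures, and the function names are
likewise illustrative.

```python
# Hypothetical parameters: access time T in microseconds and transfer
# time B in microseconds per byte, indexed by storage level.
T = {1: 1.4, 2: 130.0, 3: 5000.0}
B = {1: 0.01, 2: 0.1, 3: 0.7}
N = {1: 8, 2: 32, 3: 128}   # page sizes in bytes, N^1 < N^2 < N^3


def adjacent_transfer_time(level):
    """Tm from `level` to the level above it: a full page of N^level bytes."""
    return T[level] + N[level] * B[level]


def sequential_transfer_time(level):
    """Step-by-step forwarding: the sum of the adjacent-level transfer times."""
    return sum(adjacent_transfer_time(i) for i in range(level, 0, -1))


def read_through_time(level):
    """Read through: the processor waits only for its own N^1 bytes."""
    return T[level] + N[1] * B[level]


print("read through   Tm^{30} =", read_through_time(3))
print("adjacent level Tm^{32} =", adjacent_transfer_time(3))
print("sequential sum         =", sequential_transfer_time(3))
```

Under these assumed figures the read through time is the
smallest of the three, and the step-by-step sum is the
largest.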

4.4.2 Availability and Serviceability

The read through mechanism described above offers some
important advantages to the availability and serviceability
of the storage system. Note that all storage levels are
connected to the gated data bus, not directly to each other.
If a storage level must be removed from the system for
servicing, it is merely necessary to manually set both G_in
and G_out. In this mode, the information is really
"read through" this level as if it didn't exist. No other
changes are needed to any of the other storage levels or the
storage management algorithms, although we would expect the
performance to decrease.

4.5 Store Behind

Under normal steady-state operation, all the levels of
the storage hierarchy will be full (except possibly level
L). Thus, whenever a page is to be moved into a level, it
is necessary to remove a current page. If the page selected
for removal has not been changed by means of a processor
"store", the new page can be immediately stored into the
level since a copy of the removed page already exists in the
next lower level of the hierarchy. If the processor
generates a "store" request, all levels that contain a copy
of the information being modified must be updated. This can
be accomplished in three basic ways: (1) store through, (2)
store replacement, or (3) store behind.
4.5.1 Store Through

Under a store through policy, all levels are
simultaneously updated whenever the processor generates a
"store" request. This is the obvious inverse of the read
through policy. But there is a crucial distinction. Under
read through, only storage levels 1 through n are used,
where n is the highest level containing the requested
information. Store through must update the contents of
levels n through L. Thus, read through speed is limited by
its slowest level affected, level n; store through is always
limited by the speed of level L, the slowest level of them
all. If 20% of all processor requests are "stores", the
system success frequency function of level L will be at
least 20%. Due to its large average access time, level L
will be the dominant portion of the system's effective
access time, T*.

Store through can be used efficiently only if the
access time of level L is comparable to the access time of
level 1, such as in a two-level cache system. In fact, it
is used in some cache systems, such as the IBM System/370
Models 155 and 165 [52].
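
A rough lower bound shows why store through is impractical
when level L is slow. The sketch below assumes a simplified
model in which every "store" waits for level L and every
"fetch" is satisfied at a given average fetch time; the model,
the function name, and the 5 ms figure for level L are
assumptions made for illustration.

```python
def store_through_access_time(fetch_time_us, level_l_time_us, store_fraction):
    """Average time per request when every store must wait for level L."""
    return ((1.0 - store_fraction) * fetch_time_us
            + store_fraction * level_l_time_us)


# 20% stores (as in the text), fetches at 1.44 us, and an assumed
# 5000 us access time for level L.
t_star = store_through_access_time(1.44, 5000.0, 0.20)
print(f"T* under store through = {t_star:.3f} us")
```

Even with every fetch satisfied at Main Store speed, the
stores alone keep T* near 20% of the level L access time in
this model.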
4.5.2 Store Replacement

Under a store replacement policy, the processor only
stores into M^1. If a changed page is later selected for
removal, it is then moved to the next lower level, M^2,
immediately prior to being replaced. This process occurs at
every level and, eventually, level L will be updated, but
only after the page has been selected for removal from all
the higher levels. Due to the extra delays caused by
updating changed pages before replacement, the effective
access time for fetches is increased. Various versions of
store replacement are used in most two-level paging systems
since it offers substantially better performance than store
through for slow second level storage devices (e.g., drums
and disks).

4.5.3  Store  Behind 

Store behind is a compromise strategy that bridges the
gap between store through and store replacement and offers
substantially better performance. In both strategies above,
the storage system was required to perform the update
operation at some specific time (e.g., at the instant of the
"store" request for store through or at the instant of
removal for store replacement). Once the information to be
stored has been accepted by the storage management system,
the processor doesn't really care how or when the copies in
the storage hierarchy are updated. Store behind takes
advantage of this degree of freedom. Due to the large
disparity between average access time and transfer rate for
most levels, the maximum data transfer capacity is rarely
reached (i.e., at any instant of time, a storage level may
not have any outstanding requests for service or it may be
waiting for proper positioning to service a pending
request). During these "idle" periods, data can be
transferred down to the next level of the storage hierarchy
without affecting or delaying any fetch operation. Since
these "idle" periods are usually very frequent under most
actual circumstances, there can be a continual flow of
changed information down through the hierarchy towards level
L.
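
One way to picture store behind is as a queue of changed
pages at each level that is drained only during idle
periods. The following is a minimal sketch under that
assumption; the class, its methods, and the one-page-per-idle-
period rule are illustrative choices, not the mechanism
specified in this thesis.

```python
from collections import deque


class Level:
    """One storage level with a queue of changed pages awaiting copy-down."""

    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower        # next lower level, or None for level L
        self.pages = {}           # page id -> contents
        self.pending = deque()    # changed page ids not yet copied down

    def store(self, page_id, data):
        """Accept a store at once; the copy-down is deferred."""
        self.pages[page_id] = data
        if self.lower is not None and page_id not in self.pending:
            self.pending.append(page_id)

    def idle_tick(self):
        """Use one idle period to push one changed page down a level."""
        if not self.pending:
            return False
        page_id = self.pending.popleft()
        self.lower.store(page_id, self.pages[page_id])
        return True


m3 = Level("M3")
m2 = Level("M2", lower=m3)
m1 = Level("M1", lower=m2)

m1.store("p", "v1")          # the processor is released immediately
m1.store("p", "v2")          # a second store before any copy-down
while m1.idle_tick() or m2.idle_tick():
    pass                     # idle periods carry the change toward level L
print(m3.pages)
```

Because the copy-down reads the page contents at the time of
the idle period, two stores to the same page before that
period cost only one transfer to the lower level.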


4.6  Automatic  Management 

Although an effective storage management system should
attempt to minimize page movement and its associated
"housekeeping", there will still be a substantial amount of
work required to manage the hierarchy. It is desirable to
remove as much as possible of the storage management from
the concern of the processor and the programs running on the
processor, including the operating system. There are two
primary motivations for this objective: (1) the storage
hierarchy should function as an independent component of the
system to eliminate any added complexity to the processor or
programs, and (2) we want to conserve the processor's
computational powers for solving the user's problems rather
than for "system overhead". In actuality, of course, the
storage hierarchy cannot be divorced entirely from the rest
of the system, but the remaining interdependencies should be
minimal.

4.6.1 Distributed Control

In the hierarchical storage system described above, all
storage management operations can be determined local to a
single level or, at most, in consideration of information
from neighboring levels. Thus, it is possible to distribute
the control of the hierarchy into the levels; this also
facilitates parallel and asynchronous operation in the
hierarchy.

In a comprehensive multiple level storage hierarchy, as
illustrated in Table 1, this automatic and distributed
control can be accomplished by using two mechanisms: (1)
processor functions, and (2) "intelligent" controllers.
4.6.1.1 Processor Functions

The management of the first storage level must operate
at speeds comparable to the processor. As a result, it is
usually necessary to incorporate the first level store and
its associated management operations into the processor
hardware itself. This approach is used in the IBM
System/370 cache systems [52].

It is often desirable to incorporate the management of
the second storage level also into the processor. This
level requires substantial performance to handle the demands
for service from the first storage level. Since its
requirements are not quite as demanding as the first level,
it is an ideal candidate for firmware control, assuming that
the processor is microprogrammed. This approach has not been
used in any current commercial systems, although the
integrated (i.e., microprogrammed) channels of certain
models of the IBM System/370 are based upon similar
concepts. There have been a few experimental systems, such
as the VENUS System at MITRE, which provide processor
functions to essentially manage the paging system via
microprogramming.
4.6.1.2 "Intelligent" Controllers


For the third storage level and beyond, the storage
management performance requirements are much more modest
since most of the storage activity should occur at the first
and second levels. For these lower levels, it is possible
to develop independent storage management control facilities
for each level. This can be accomplished by extending the
functionality of conventional device controllers. Some
recent sophisticated device controllers are microprogrammed
and are already capable of performing the storage management
function [1].


4.6.2 Multiprogramming


Up to now we have tacitly assumed that the processor
becomes idle whenever it is necessary to fetch information
from the storage hierarchy. This may be a reasonable policy
for two-level cache systems since the processor is never
idle for more than one or two microseconds at a time. But,
for paging systems and general multiple level storage
hierarchies, the processor may be idled for periods of
hundreds or thousands of microseconds at a time. It is
worthwhile to try to find useful work for the processor
while the storage hierarchy is retrieving the requested
information.

In most conventional computer systems, processor idle
time is utilized by multiprogramming. This requires that
there be multiple programs available to be run. Whenever
one program must be delayed due to a time-consuming storage
request, the processor is switched to another program.
Under reasonable circumstances (e.g., many programs ready
for execution and moderate load on the storage system), it
is possible to keep the processor continually busy. Thus,
the effective system storage access time, T*, will very
closely approximate T^1.

Unfortunately, the process of switching execution from
one program to another can result in a considerable amount
of processor overhead. For example, an early version of the
Multics operating system was reported to require 10
milliseconds to switch programs; typical operating systems
require up to 1 millisecond. The time required to
accomplish this multiprogram switch can be drastically
reduced if the multiprogramming management is also
incorporated into the processor along with the first and
second storage level management. Although the particular
purposes were different, hardware supported multiprogramming
has been available on several computing systems, such as the
Honeywell 630 series [46] and more recently in the Singer
System Ten [30]. The less frequently executed operating
system functions, such as job scheduling and time-sharing
management algorithms, can be supported by the software
operating system as on conventional systems without adversely
affecting performance.

4.7 Comments on the Storage Hierarchy System Design

This chapter has presented the key concepts of a
general multiple level storage hierarchy system. Many of
the particular details of the system will require
considerable investigation and experimentation to determine
an optimal implementation. Three important factors are
extensively studied in the following chapters: (1) other
page size considerations, (2) removal algorithms, and (3)
relevant models for program reference behavior.


CHAPTER 5.

ANALYSIS OF PAGE SIZE CONSIDERATIONS

5.0 Introduction

One of the most important parameters of a storage
hierarchy system is the page size, the unit of information
transfer between two levels of the hierarchy. In this
chapter, the factors influencing page size are examined from
the device characteristics viewpoint and the program
behavior viewpoint.

5.1 The Page Size Issue

On contemporary two-level paging systems (based upon
two devices similar to devices 2 and 4 of Table 1), the page
size is usually quite large (typically 4096 bytes for paging
systems) to take advantage of M²'s large transfer rate to
compensate for its slow access time. Such a large page size
is justified by reliance on the Principle of Locality.
Considering the devices of Table 1 for example, a single
byte can be accessed and transferred between M¹ and M² in
about 5 milliseconds whereas 4096 contiguous bytes can be
fetched in 7.8 milliseconds, only 56% more time.

5.1.1 Page Size Investigations

Although paging systems have been used successfully,
the effect of page size has become the subject of increasing
investigation. This interest has been aroused due to several
considerations:

1. It has been noted by Denning [26] that the
utilization of M¹ is maximized and "page breakage" minimized
by using rather small pages, such as 200 bytes. In
particular, he emphasizes:

"These results are significant ... small pages
permit a great deal of compression without loss of
efficiency. Small page sizes will yield significant
improvements in storage utilization ..."

2. The success of cache systems indicates that the
Principle of Locality applies on the microscopic scale as
well as the macroscopic scale of conventional paging
systems.

3. The recent introduction of several new device
technologies, such as the "semiconductor drum" [35] with an
average access time of about 100 microseconds, drastically
reduces the benefits of very large page sizes in a paging
system.

4. Although most current multilevel systems employ
only two levels, this thesis is concerned with multiple
level storage hierarchies (i.e., three or more levels). In
fact, storage systems with six or more levels are quite
plausible. A deep understanding of the effects of various
page sizes is essential to the development of such systems.

Thus, although there are many reasons for considering
new page sizes, there is not a complete understanding of the
impact of such a change. Denning [26] sums up our current
knowledge as follows:

"Two factors primarily influence the choice of page
size: fragmentation and efficiency of page-transport
operation."

In this chapter some other factors of potentially crucial
importance will be discussed.

5.2 Anomalies

One of the more intriguing and frustrating aspects of
complex systems, such as paging systems, is the occurrence
of anomalies (i.e., phenomena that are contrary to "common
sense"). For example, Belady [10] has shown that certain
storage management removal algorithms, in particular FIFO
(first-in first-out), may actually cause performance to
decrease as the capacity of M¹ is increased. This result is
contrary to the general belief that "more main memory makes
things work out better". Thus, one must exercise
considerable care when considering "tinkering" with the
parameters, such as page size, of a multilevel storage
system.
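
The FIFO behavior attributed to Belady above can be reproduced with a few lines of simulation. The sketch below is illustrative only: the reference string is the commonly quoted textbook example, not a trace taken from this thesis, and `fifo_fetches` is a hypothetical helper name.

```python
from collections import deque

def fifo_fetches(trace, frames):
    """Count page fetches under demand fetching with FIFO removal."""
    resident = deque()              # oldest page sits at the left end
    fetches = 0
    for page in trace:
        if page not in resident:
            fetches += 1
            if len(resident) == frames:
                resident.popleft()  # remove the page that was loaded earliest
            resident.append(page)
    return fetches

# Assumed example trace (textbook illustration, not from the thesis).
trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_fetches(trace, 3))  # 9
print(fifo_fetches(trace, 4))  # 10
```

With this trace, enlarging the primary store from 3 to 4 pages raises the fetch count from 9 to 10, which is the kind of counter-intuitive result the paragraph warns about.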

The objective of this chapter is to present and analyze
some anomalies encountered when the page size parameter is
changed in a paging system.


5.3 The Page Size Anomaly

For simplicity, let us start by considering the effect
of decreasing the page size used in a two-level system, S,
from N to N' where N' = N/2 in this new system, S'. In
particular, we wish to investigate the effects upon the
failure frequencies which are f and f', respectively. We
define the ratio f'/f to be r. The possible results can be
partitioned into three interesting regions:

1. r < 1.

2. 1 ≤ r ≤ 2.

3. r > 2.


5.3.1 Case 1: r < 1 ( f' < f )

This would be a highly desirable result since the
number of page fetches is actually decreased. Furthermore,
the time required to access and transfer a page of size N'
would be expected to be less than that required for the
larger page size N. Figure 6 illustrates an instance of this
case. In converting an address trace to a page trace for S',
the logical page addresses p+ and p- are used to represent
the two halves of the page p of size N. Note that when using
a page size of N/2 instead of N, M¹ actually holds twice as
many pages though each page is only half as large.


Parameters

As seen by S:
  • P = a, b, c, a, b, c
  • |P| = 6
  • Q = { a, b, c }
  • |Q| = 3
  • |M¹| = 2
  • FIFO Removal

As seen by S':
  • P = a+, b+, c+, a+, b+, c+
  • |P| = 6
  • Q = { a+, b+, c+ }
  • |Q| = 3
  • |M¹| = 4
  • FIFO Removal

Simulation

Page Trace:      a+  b+  c+  a+  b+  c+

S
  Fetch:         *   *   *   *   *   *
  M¹ Contents:   a   b   c   a   b   c
                     a   b   c   a   b

S'
  Fetch:         *   *   *
  M¹ Contents:   a+  b+  c+  c+  c+  c+
                     a+  b+  b+  b+  b+
                         a+  a+  a+  a+

Results
  • F = 6
  • F' = 3
  • r = 3/6 = 0.5

Figure 6.
Example of Case 1


In the example of Figure 6, r = 0.5, which means that
the number of page fetches was cut in half by using the
smaller page size N'. This type of result might be expected
from a program that exhibited a rather sparse and
non-localized reference behavior. Recall that in typical
two-level paging systems, a page of size 4096 bytes is
fetched even though a single reference uses only a few
bytes. Unless the program immediately makes many more
references to this page, much of it will have been fetched
but not used. Under these circumstances, M¹ might be better
utilized by holding a larger and more diversified collection
of pages, even if each page were smaller.
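
The counts in Figure 6 can be checked mechanically. The following is a minimal sketch, assuming demand fetching with FIFO removal as stated in the figure; the function names and the encoding of a half page as a string such as "a+" are illustrative choices, not notation from the thesis.

```python
def fifo_fetches(trace, frames):
    """Count page fetches for a FIFO-managed primary store of `frames` pages."""
    resident = []                  # oldest page first
    fetches = 0
    for page in trace:
        if page not in resident:
            fetches += 1
            if len(resident) == frames:
                resident.pop(0)    # FIFO: drop the page loaded earliest
            resident.append(page)
    return fetches

# Trace of Figure 6 as seen by S' (half pages); only the "+" halves are used.
half_page_trace = ["a+", "b+", "c+", "a+", "b+", "c+"]
# The same references as seen by S: drop the half indicator.
full_page_trace = [p[0] for p in half_page_trace]

f = fifo_fetches(full_page_trace, 2)        # S holds 2 pages of size N
f_prime = fifo_fetches(half_page_trace, 4)  # S' holds 4 pages of size N/2
print(f, f_prime, f_prime / f)              # 6 3 0.5
```

Because the program touches only one half of each page, the three half pages all fit in the primary store of S', while the three full pages do not fit in that of S.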


5.3.2 Case 2: 1 ≤ r ≤ 2 ( f ≤ f' ≤ 2f )

This is a transitional region. For r = 1, S' will
perform better than S since the number of page fetches is
the same and the time required for each fetch is less. For r
= 2, S' will require twice as many page fetches. This will
usually swamp any page transfer benefit derived from the
smaller page size, thus S would perform better. The specific
point of transition, r', depends largely upon the time
required to access and transfer a page, T and T'
respectively in S and S', such that r' = T/T'.
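
As a rough worked example of the transition point, the sketch below plugs in the two figures quoted in Section 5.1 (about 5 milliseconds to access a single byte, 7.8 milliseconds for 4096 contiguous bytes). It assumes a linear model, fetch time = access time + bytes/transfer rate, which is an assumption made here for illustration and is not taken from Table 1.

```python
ACCESS_MS = 5.0                    # assumed: time to reach the first byte
TRANSFER_MS_4096 = 7.8 - 5.0       # assumed: pure transfer time for 4096 bytes

def fetch_time_ms(page_bytes):
    """Linear model: fixed access time plus size-proportional transfer time."""
    return ACCESS_MS + TRANSFER_MS_4096 * page_bytes / 4096

T = fetch_time_ms(4096)            # page size N in system S
T_prime = fetch_time_ms(2048)      # page size N' = N/2 in system S'
r_transition = T / T_prime         # r' = T/T'

print(f"{T / fetch_time_ms(1):.2f}")   # 1.56, the "56% more time" of Section 5.1
print(f"{r_transition:.2f}")           # 1.22
```

Under these assumed device figures, S' is the better system only while r stays below roughly 1.22; beyond that point the extra fetches cost more than the shorter transfers save.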

Figure 7 illustrates an extreme example of Case 2 where
r = 2.0. This means that the number of page fetches was
doubled by using the smaller page size N'. This type of
result might be expected from a program that exhibited a
dense, localized, and sequential reference behavior.

Intuitively, the r = 2.0 result is the "worst" case
since we are being forced to always load both the p+ and p-
halves of each original page p, thereby losing all the
benefits of the smaller N' page size and incurring twice as
many actual page faults. This intuitive observation is
false; r = 2.0 is not the "worst" case.


Parameters

As seen by S:
  • P = a, a, b, b, c, c
  • |P| = 6
  • Q = { a, b, c }
  • |Q| = 3
  • |M¹| = 2
  • FIFO Removal

As seen by S':
  • P = a+, a-, b+, b-, c+, c-
  • |P| = 6
  • Q = { a+, a-, b+, b-, c+, c- }
  • |Q| = 6
  • |M¹| = 4
  • FIFO Removal

Simulation

Page Trace:      a+  a-  b+  b-  c+  c-

S
  Fetch:         *       *       *
  M¹ Contents:   a   a   b   b   c   c
                         a   a   b   b

S'
  Fetch:         *   *   *   *   *   *
  M¹ Contents:   a+  a-  b+  b-  c+  c-
                     a+  a-  b+  b-  c+
                         a+  a-  b+  b-
                             a+  a-  b+

Results
  • F = 3
  • F' = 6
  • r = 6/3 = 2.0

Figure 7.
Example of Case 2


5.3.3 Case 3: r > 2 ( f' > 2f )

This third region, besides being intuitively
impossible, is clearly undesirable. Since the number of page
fetches required would be more than doubled, the performance
of S' would be undoubtedly worse than S. Depending upon the
actual value of r, the performance could be much worse.
Figure 8 illustrates a reference pattern that produces a
result of r = 2.75.


Parameters

As seen by S:
  • P = a, b, a, b, c, c, b, a, a, c, c
  • |P| = 11
  • Q = { a, b, c }
  • |Q| = 3
  • |M¹| = 2
  • FIFO Removal

As seen by S':
  • P = a+, b+, a-, b-, c+, c-, b+, a+, a-, c+, c-
  • |P| = 11
  • Q = { a+, a-, b+, b-, c+, c- }
  • |Q| = 6
  • |M¹| = 4
  • FIFO Removal

Simulation

Page Trace:      a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-

S
  Fetch:         *   *           *           *
  M¹ Contents:   a   b   b   b   c   c   c   a   a   a   a
                     a   a   a   b   b   b   c   c   c   c

S'
  Fetch:         *   *   *   *   *   *   *   *   *   *   *
  M¹ Contents:   a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-
                     a+  b+  a-  b-  c+  c-  b+  a+  a-  c+
                         a+  b+  a-  b-  c+  c-  b+  a+  a-
                             a+  b+  a-  b-  c+  c-  b+  a+

Results
  • F = 4
  • F' = 11
  • r = 11/4 = 2.75

Figure 8.
Example of Case 3


This region of operation will be the subject of discussion
for the remainder of this chapter. We formalize this
situation by the following existence theorem.


THEOREM 1:

There exists a page trace, P, and demand-fetch
FIFO-removal two-level storage systems, S and S', with
page sizes N and N'=N/2, respectively, such that the
ratio, r, of fetch frequency f' to f exceeds 2.

Proof:

By example (Figure 8).
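
Since the proof is by example, it can be replayed directly. The sketch below is illustrative: it assumes demand fetching with FIFO removal as in Figures 7 and 8, and the helper names and the "a+" string encoding of half pages are hypothetical.

```python
def fifo_fetches(trace, frames):
    """Count page fetches for a FIFO-managed primary store of `frames` pages."""
    resident = []                  # oldest page first
    fetches = 0
    for page in trace:
        if page not in resident:
            fetches += 1
            if len(resident) == frames:
                resident.pop(0)    # FIFO: drop the page loaded earliest
            resident.append(page)
    return fetches

def fetch_ratio(half_page_trace, frames):
    """Return (F, F', r) for S with `frames` pages and S' with twice as many."""
    full_page_trace = [p[0] for p in half_page_trace]   # "a+" -> "a"
    f = fifo_fetches(full_page_trace, frames)
    f_prime = fifo_fetches(half_page_trace, 2 * frames)
    return f, f_prime, f_prime / f

figure7 = "a+ a- b+ b- c+ c-".split()
figure8 = "a+ b+ a- b- c+ c- b+ a+ a- c+ c-".split()
print(fetch_ratio(figure7, 2))   # (3, 6, 2.0)
print(fetch_ratio(figure8, 2))   # (4, 11, 2.75)
```

The second result exceeds 2, which is all that the existence claim of Theorem 1 requires.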


5.3.4 Other Removal Algorithms

Theorem 1 states the anomaly that decreasing page size
by a factor of two can cause the page fetch frequency to
increase by more than a factor of two. The two-level
demand-fetch conditions of Theorem 1 are typical of most
contemporary paging systems. But, to put this situation into
perspective, other removal algorithms must be considered.
Due to its simplicity, the FIFO removal algorithm was used
in many of the early paging systems. In recent times it has
been found that FIFO has certain disturbing peculiarities
(e.g., the system's success frequency, s, is not a monotonic
function of primary store size, |M¹| [10]). Furthermore,
other removal algorithms have been found to be empirically
closer approximations to the "optimal" removal algorithm,
MIN [11]. MIN itself is not physically realizable since it
requires future knowledge, but it can be used as a basis for
performance comparison with practical algorithms.

Various forms of the "least recently used" (LRU)
removal algorithm have become popular in contemporary
systems. Under LRU, the page selected for removal from the
primary store is the one that has not been referenced for
the longest time (i.e., the least recently used page).
Empirically, LRU has been found to closely approximate the
performance of the "optimal" algorithm for many actual
programs. Furthermore, Mattson et al [63] have studied LRU
and found that it is a member of a general class of removal
algorithms called "stack algorithms". The class of stack
algorithms, as noted by Denning [25], "contains all the
'reasonable' algorithms". In particular, stack algorithms
all satisfy an inclusion property that results in well
behaved characteristics. For example, it has been proven
that all stack algorithms, including LRU, have a success
frequency that is a monotonic function of primary store size
and are immune to the FIFO peculiarity observed by Belady.
Thus, one might be tempted to assume that the page size
anomaly is also a phenomenon unique to FIFO removal and
would not occur if a "well behaved" removal algorithm, such
as LRU, were used. This expectation can be rapidly destroyed
by observing Figure 9, which is the same system as Figure 8
but with an LRU removal algorithm. In this example, the page
fetch frequency ratio, r, is 2.2 which still exceeds 2. This
result leads us to Theorem 2 and Corollary 2a.


THEOREM 2:

There exists a page trace, P, and demand-fetch
LRU-removal two-level storage systems, S and S', with
page sizes N and N'=N/2, respectively, such that the
ratio, r, of fetch frequency f' to f exceeds 2.

Proof:

By example (Figure 9).

COROLLARY 2a:

Given a page trace, P, and demand-fetch two-level
storage systems, S and S', with page sizes N and
N' = N/2, respectively, the use of a "stack" removal
algorithm (i.e., an algorithm with the "inclusion
property") is not sufficient to guarantee that the
ratio, r, of fetch frequency f' to f will be bounded by
2.


Parameters

As seen by S:
  • P = a, b, a, b, c, c, b, a, a, c, c
  • |P| = 11
  • Q = { a, b, c }
  • |Q| = 3
  • |M¹| = 2
  • LRU Removal

As seen by S':
  • P = a+, b+, a-, b-, c+, c-, b+, a+, a-, c+, c-
  • |P| = 11
  • Q = { a+, a-, b+, b-, c+, c- }
  • |Q| = 6
  • |M¹| = 4
  • LRU Removal

Simulation

Page Trace:      a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-

S
  Fetch:         *   *           *           *       *
  M¹ Contents:   a   b   a   b   c   c   b   a   a   c   c
                     a   b   a   b   b   c   b   b   a   a

S'
  Fetch:         *   *   *   *   *   *   *   *   *   *   *
  M¹ Contents:   a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-
                     a+  b+  a-  b-  c+  c-  b+  a+  a-  c+
                         a+  b+  a-  b-  c+  c-  b+  a+  a-
                             a+  b+  a-  b-  c+  c-  b+  a+

Results
  • F = 5
  • F' = 11
  • r = 11/5 = 2.2

Figure 9.
Example of Case 3 (for LRU Removal)
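
The LRU example of Figure 9 can be replayed in the same way as the FIFO examples. This is a minimal sketch, assuming demand fetching with LRU removal; `lru_fetches` and the string encoding of half pages are illustrative names, not part of the thesis.

```python
def lru_fetches(trace, frames):
    """Count page fetches for an LRU-managed primary store of `frames` pages."""
    stack = []                     # most recently used page first
    fetches = 0
    for page in trace:
        if page in stack:
            stack.remove(page)     # hit: page will move to the top
        else:
            fetches += 1
            if len(stack) == frames:
                stack.pop()        # miss with full store: drop least recently used
        stack.insert(0, page)
    return fetches

figure9 = "a+ b+ a- b- c+ c- b+ a+ a- c+ c-".split()
f = lru_fetches([p[0] for p in figure9], 2)   # S: 2 pages of size N
f_prime = lru_fetches(figure9, 4)             # S': 4 pages of size N/2
print(f, f_prime, f_prime / f)                # 5 11 2.2
```

The trace is identical to that of Figure 8; only the removal rule differs, and the ratio still exceeds 2.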

The previous theorems prove that there exist page
traces that result in significantly increased page fetch
frequencies if the page size is decreased. It is necessary
to consider the likelihood of encountering such page trace
patterns in actual programs. For example, it can be proven
that, as you are reading this sentence, all the molecules of
air in the room may suddenly move towards the opposite
corner and cause you to suffocate. If you survived the last
sentence, you have probably deduced that the likelihood of
that event is extremely small, fortunately.


5.4.1 Simulation Studies

Hatfield [48] and Seligman [78] have performed
experiments that indicate that the page size anomaly is very
common, if not inevitable, in actual programs. In both cases
actual programs were monitored and their corresponding page
trace reference strings were recorded, usually on magnetic
tape. Then simulators were used that mimicked the
software and hardware of a two-level storage system.
By supplying the recorded page trace as input to the
simulators, the performance of such a system could be
measured. [illegible]
of these results have been confirmed in some cases by
running the real programs under a real two-level storage
system.


5.4.2 Hatfield Studies

Hatfield [48] performed studies in the hardware
environment of the IBM System/360 Model 67 with programs
running under the CP-67/CMS Operating System. The simulated
performance was measured for various page sizes, N, and
various primary store sizes, |M¹|. In summary, it was
confirmed that certain programs, which were viewed as
examples of low-density storage use, resulted in decreased
page fetch frequency when page size was decreased. But, it
was observed that for programs with much greater
localization of heavily used storage:

"not only does the smaller page size often generate
nearly twice as many page fetches as the large page
size, it often resulted in more than twice the page
fetches, contrary to our intuitions."

In particular, the substantially increased page fetch
frequency appears to be:

"a characteristic of programs which have a high
locality and therefore perform well on systems using
relocation hardware for address translation and is
characteristic of those programs in the region of
low paging rate."

In other words, the anomaly is most prevalent in programs
"optimized" for performance in a two-level storage system
when running under nearly "optimal" conditions!


5.4.3 Seligman Studies

Whereas Hatfield was concerned with a paging system
with page sizes in the range from 2048 to 16384 bytes,
Seligman [78] analyzed a proposed cache system with much
smaller page sizes in the range of 8 to 256 bytes. He
observed that:

"interestingly, the missing page probability (for
this data) is minimized for a page size which
increases slowly with total memory size. Note that
the associative memory organization, where page size
equals one word, is not optimal; to borrow a phrase
from economics, the marginal utility of the extra
words fetched in a page is higher than that of those
displaced".

Thus, continual decreasing of page size appears to have an
inevitable adverse effect upon system performance.


5.4.4 Other Questions Raised

Now that it has been shown that the page size anomaly
is theoretically possible and has been observed in practice,
there are several other questions to be answered. Since the
page fetch frequency ratio is not bounded by r = 2, what
bounds, if any, do exist? Hatfield implicitly raised another
question by the statement:

"as yet we have been unable to prove that there is a
replacement algorithm using only the past history of
page requests which cannot generate more than twice
the exceptions with half size pages."

The answers to these questions are the subjects of the
following sections and chapters.

5.5 Bounds on the Page Fetch Frequency Ratio

It has been shown that the page fetch frequency ratio
can exceed r = 2, but just how bad can it get? Of equal
importance, what factors influence this bound? These
questions will be discussed in this section.

5.5.1 Cyclic Page Traces

Figures 10 and 11 represent page trace simulations for
two sets of demand-fetch LRU-removal two-level storage
systems with primary store sizes |M¹| = 2 and |M¹| = 3,
respectively. In both cases, it can be observed that the
page trace simulated is cyclic with a repeated pattern, Pc.
In Figure 10, the page trace consists of the repeated
pattern:

Pc = a+ b+ c+ c- b- a-

whereas Figure 11 repeats the similar pattern:

Pc = a+ b+ c+ d+ d- c- b- a-


Parameters

As seen by S:
  • P = a, b, c, c, b, a, a, b, c, c, b, a
  • |P| = 12
  • Q = { a, b, c }
  • |Q| = 3
  • |M¹| = 2
  • LRU Removal

As seen by S':
  • P = a+, b+, c+, c-, b-, a-, a+, b+, c+, c-, b-, a-
  • |P| = 12
  • Q = { a+, a-, b+, b-, c+, c- }
  • |Q| = 6
  • |M¹| = 4
  • LRU Removal

Simulation

                 <-- transient cycle ---><- steady-state cycle ->
Page Trace:      a+  b+  c+  c-  b-  a-  a+  b+  c+  c-  b-  a-

S
  Fetch:         *   *   *           *           *           *
  M¹ Contents:   a   b   c   c   b   a   a   b   c   c   b   a
                     a   b   b   c   b   b   a   b   b   c   b

S'
  Fetch:         *   *   *   *   *   *   *   *   *   *   *   *
  M¹ Contents:   a+  b+  c+  c-  b-  a-  a+  b+  c+  c-  b-  a-
                     a+  b+  c+  c-  b-  a-  a+  b+  c+  c-  b-
                         a+  b+  c+  c-  b-  a-  a+  b+  c+  c-
                             a+  b+  c+  c-  b-  a-  a+  b+  c+

Results
  • F = 6
  • F' = 12
  • r = 12/6 = 2.0

For the steady-state cycle:
  • F = 2
  • F' = 6
  • /r/ = 6/2 = 3.0

Figure 10.
Cyclic Page Trace with |M¹| = 2


5.5.2 Steady State Cyclic Page Traces

Let us consider Figure 10 first. The page fetch ratio,
r, is 2.0 in this case. As noted earlier, the page trace can
be subdivided into an initial transient stage, Pt, with a
high page fetch frequency followed by a steady-state stage,
Ps, with usually a lower page fetch frequency. In Figure 10,
the first Pc cycle contains the entire start-up transient
stage and completely fills all the available space in M¹.
Thus, the second Pc cycle represents the start of the
steady-state stage. Furthermore, since the content and page
ordering of M¹ are exactly the same at the end of the second
cycle as they were at the beginning of that cycle for both S
and S', the page trace cycle, Pc, can be repeated
continuously with exactly the same results each time for
page fetch requests and M¹ contents. If /r/ is defined to be
the page fetch frequency ratio for the first steady-state
period, Pc, of a cyclic page trace, (Pc)*, /r/ is also the
page fetch frequency ratio for the entire steady-state
portion of the page trace defined by the regular expression:

P = Pt·Ps = Pt·(Pc)*

As the length of the page trace, |P|, becomes large in
comparison with the length of the transient stage, |Pt|, the
overall page fetch frequency ratio, r, asymptotically
approaches the value of the steady-state cycle page fetch
frequency ratio, /r/. In Figure 10, /r/ = 3.0, thus r will
increase from 2.0 towards 3.0 as the page trace is
lengthened by continually repeating the pattern Pc. Thus,
the page fetch frequency ratio, r, for the page trace

P = ( a+ b+ c+ c- b- a- )*

is bounded by 3.0 when |M¹| = 2.
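
The approach of r towards /r/ = 3.0 can be observed numerically by repeating the cycle of Figure 10. The sketch below is illustrative only; `lru_fetches` and `overall_ratio` are hypothetical helper names, and the chosen cycle counts are arbitrary.

```python
def lru_fetches(trace, frames):
    """Count page fetches for an LRU-managed primary store of `frames` pages."""
    stack = []                     # most recently used page first
    fetches = 0
    for page in trace:
        if page in stack:
            stack.remove(page)
        else:
            fetches += 1
            if len(stack) == frames:
                stack.pop()        # remove the least recently used page
        stack.insert(0, page)
    return fetches

PC = "a+ b+ c+ c- b- a-".split()   # repeated pattern of Figure 10

def overall_ratio(cycles):
    """Overall ratio r for the trace consisting of `cycles` repetitions of PC."""
    half_page_trace = PC * cycles
    full_page_trace = [p[0] for p in half_page_trace]
    return lru_fetches(half_page_trace, 4) / lru_fetches(full_page_trace, 2)

for k in (2, 10, 100, 1000):
    print(k, f"{overall_ratio(k):.3f}")   # 2.000, 2.727, 2.970, 2.997
```

With k cycles, S incurs 4 fetches in the transient cycle and 2 in each later cycle, while S' incurs 6 in every cycle, so r = 6k/(2k+2), which stays below 3 and approaches it as k grows.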

A similar situation is illustrated in Figure 11. In
this example, r = 2.28 and /r/ = 4.0. Thus, the page fetch
frequency ratio, r, for the page trace

P = ( a+ b+ c+ d+ d- c- b- a- )*

is bounded by 4.0 when |M¹| = 3. By generalizing these
examples, we arrive at Theorem 3 and Corollary 3a.


Parameters

As seen by S:
  • P = a, b, c, d, d, c, b, a, a, b, c, d, d, c, b, a
  • |P| = 16
  • Q = { a, b, c, d }
  • |Q| = 4
  • |M¹| = 3
  • LRU Removal

As seen by S':
  • P = a+, b+, c+, d+, d-, c-, b-, a-, a+, b+, c+, d+, d-, c-, b-, a-
  • |P| = 16
  • Q = { a+, a-, b+, b-, c+, c-, d+, d- }
  • |Q| = 8
  • |M¹| = 6
  • LRU Removal

Simulation

                 <------ transient cycle -------><----- steady-state cycle ----->
Page Trace:      a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+  d+  d-  c-  b-  a-

S
  Fetch:         *   *   *   *               *               *               *
  M¹ Contents:   a   b   c   d   d   c   b   a   a   b   c   d   d   c   b   a
                     a   b   c   c   d   c   b   b   a   b   c   c   d   c   b
                         a   b   b   b   d   c   c   c   a   b   b   b   d   c

S'
  Fetch:         *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
  M¹ Contents:   a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+  d+  d-  c-  b-  a-
                     a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+  d+  d-  c-  b-
                         a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+  d+  d-  c-
                             a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+  d+  d-
                                 a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+  d+
                                     a+  b+  c+  d+  d-  c-  b-  a-  a+  b+  c+

Results
  • F = 7
  • F' = 16
  • r = 16/7 = 2.28

For the steady-state cycle:
  • F = 2
  • F' = 8
  • /r/ = 8/2 = 4.0

Figure 11.
Cyclic Page Trace with |M¹| = 3


THEOREM 3:

For any two demand-fetch LRU-removal two-level storage
systems, S and S', with page sizes N and N'=N/2 and
primary store sizes |M¹| and |M¹'| = 2|M¹|, respectively,
there exists a cyclic page trace, P = (Pc)*, where |Pc| =
2(|M¹|+1), such that the steady-state page fetch
frequency ratio, /r/, equals |M¹|+1.

Proof:

(See below).

COROLLARY 3a:

For any two demand-fetch LRU-removal two-level storage
systems, S and S', with page sizes N and N'=N/2 and
primary store sizes |M¹| and |M¹'| = 2|M¹|, respectively,
there exists a cyclic page trace, P = (Pc)*, where |Pc| =
2(|M¹|+1), such that the overall page fetch frequency
ratio, r, asymptotically approaches the bound |M¹|+1 as
|P| approaches infinity.
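
The statement of Theorem 3 can be spot-checked by simulation for small primary store sizes. The sketch below is not a proof; it assumes the cycle generalized from Figures 10 and 11 (pages 0+, 1+, ..., n+ followed by n-, ..., 1-, 0-), and the helper names are hypothetical.

```python
def lru_fetches(trace, frames):
    """Count page fetches for an LRU-managed primary store of `frames` pages."""
    stack = []                     # most recently used page first
    fetches = 0
    for page in trace:
        if page in stack:
            stack.remove(page)
        else:
            fetches += 1
            if len(stack) == frames:
                stack.pop()        # remove the least recently used page
        stack.insert(0, page)
    return fetches

def cycle(n):
    """Half-page cycle of length 2(n+1): 0+ .. n+ then n- .. 0-."""
    up = [(p, "+") for p in range(n + 1)]
    down = [(p, "-") for p in range(n, -1, -1)]
    return up + down

def steady_state_ratio(n):
    """/r/ measured on the third cycle, for |M1| = n and |M1'| = 2n."""
    pc = cycle(n)
    full = [p for p, _ in pc]      # the same cycle as seen by S
    f = lru_fetches(full * 3, n) - lru_fetches(full * 2, n)
    f_prime = lru_fetches(pc * 3, 2 * n) - lru_fetches(pc * 2, 2 * n)
    return f_prime / f

print([steady_state_ratio(n) for n in range(1, 7)])
# [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

In every case tried, S fetches 2 pages per steady-state cycle while S' fetches all 2(n+1), giving /r/ = n+1 as the theorem states.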


5.5.3 Proof of Theorem 3

5.5.3.1 Notation and Properties

Assume a fixed page size N and primary store of size S¹, let
n = the number of pages in M¹ (i.e., n = |M¹| = S¹/N). It
has been shown by Mattson et al [63] that a demand-fetch
LRU-removal algorithm has the following properties:

P1. If M¹ is initially empty, it fills with the first
n distinct pages referenced by the trace.

P2. At any time t, M¹ contains the n most recently
referenced distinct pages.

P3. a) LRU satisfies the inclusion property

M¹(1) ⊆ M¹(2) ⊆ ... ⊆ M¹(n)

where M¹(1) means the contents of M¹ if n=1,
etc.

b) At any time t after M¹ has become filled, there
is a strict removal ordering referred to as the
LRU stack

S = { s(1), s(2), ..., s(n) }

where

s(i) = M¹(i) - M¹(i-1) for i = 1, 2, ..., n

and s(n) is the page to be removed next.
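
Property P3 can be illustrated by running the same trace against several primary store sizes and comparing the resulting LRU stacks. This is only a sketch on one example trace (the S trace of Figure 11); the function names are hypothetical.

```python
def lru_stacks(trace, frames):
    """Return the LRU stack (most recently used first) after each reference."""
    stack, history = [], []
    for page in trace:
        if page in stack:
            stack.remove(page)
        elif len(stack) == frames:
            stack.pop()            # s(n), the least recently used page, is removed
        stack.insert(0, page)
        history.append(list(stack))
    return history

def inclusion_holds(trace, max_frames):
    """Check that the stack for size k is a prefix of the stack for size k+1."""
    runs = [lru_stacks(trace, k) for k in range(1, max_frames + 1)]
    for smaller, larger in zip(runs, runs[1:]):
        for a, b in zip(smaller, larger):
            if b[:len(a)] != a:
                return False
    return True

trace = list("abcddcbaabcddcba")
print(inclusion_holds(trace, 5))   # True
```

The prefix test is slightly stronger than set inclusion: it also shows that s(i) is the single page that M¹(i) holds beyond M¹(i-1), as in P3(b).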


5.5.3.2 Definition 3-a:

For any integer n, let us consider a page trace, P°,
consisting of the repeated pattern, Pc°, of length |Pc°| =
2(n+1)

P° = Pc°[n]*

where

Pc°[n] = { Pc°(1), Pc°(2), ..., Pc°(2n+1), Pc°(2n+2) }.

The Pc°(i)s are defined as follows:

Pc°(i) = 2(i-1)     for i = 1, ..., n+1

Pc°(i) = 4n+5-2i    for i = n+2, ..., 2n+2

From Pc°[n] we obtain the
corresponding cyclic page trace patterns, Pc[n] and Pc'[n],
for S and S', respectively. These are defined as follows:
For a given value of n and i = 1, 2, ..., 2n+2

Pc(i) = integer[ Pc°(i)/2 ]

Pc'(i) = (integer[ Pc°(i)/2 ])+   if rem[ Pc°(i)/2 ] = 0

Pc'(i) = (integer[ Pc°(i)/2 ])-   if rem[ Pc°(i)/2 ] = 1

Thus, for n = 2:

P[2] = { 0, 1, 2, 2, 1, 0, 0, 1, 2, 2, 1, 0, ... }

P'[2] = { 0+, 1+, 2+, 2-, 1-, 0-, 0+, 1+, 2+, 2-, 1-, 0-,
... }

We can see that these page traces are identical to the page
traces of Figure 10 with appropriate relabeling (i.e., a=0,
b=1, c=2).
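
Definition 3-a translates directly into code. The sketch below builds one cycle of Pc°[n], Pc[n], and Pc'[n] from the formulas above; the function names `pc0`, `pc`, and `pc_prime` are illustrative and do not appear in the thesis.

```python
def pc0(n):
    """One cycle of Pc°[n]: 2(i-1) for i <= n+1, else 4n+5-2i (i = 1..2n+2)."""
    return [2 * (i - 1) if i <= n + 1 else 4 * n + 5 - 2 * i
            for i in range(1, 2 * n + 3)]

def pc(n):
    """Cycle as seen by S: integer[ Pc°(i)/2 ]."""
    return [v // 2 for v in pc0(n)]

def pc_prime(n):
    """Cycle as seen by S': '+' half if the remainder is 0, '-' half if 1."""
    return [f"{v // 2}{'+' if v % 2 == 0 else '-'}" for v in pc0(n)]

print(pc0(2))       # [0, 2, 4, 5, 3, 1]
print(pc(2))        # [0, 1, 2, 2, 1, 0]
print(pc_prime(2))  # ['0+', '1+', '2+', '2-', '1-', '0-']
```

For n = 2 the output reproduces P[2] and P'[2] as listed in the definition.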


5.5.3.3 Lemma 3-b:

The page references of the set

{ Pc(1), ..., Pc(n+1) }

are distinct.

Proof:

Based upon the definitions of Pc°[n] and Pc[n], we see
that

For i = 1, ..., n+1

Pc(i) = integer[ Pc°(i)/2 ]
      = integer[ 2(i-1)/2 ]
      = integer[ i-1 ]
      = i-1.

Thus, each value of Pc(i) for i = 1, ..., n+1 is distinct.

Q.E.D.


5.5.3.4 Lemma 3-c:

The page references of the set

{ Pc(n+2), ..., Pc(2n+2) }

are distinct.

Proof:

Based upon the definitions of Pc°[n] and Pc[n], we see
that

For i = n+2, ..., 2n+2

Pc(i) = integer[ Pc°(i)/2 ]
      = integer[ (4n+5-2i)/2 ]
      = integer[ 2n+2+(1/2)-i ]
      = 2n+2-i

Thus, each value of Pc(i) for i = n+2, ..., 2n+2 is
distinct.

Q.E.D.
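
The closed forms derived in Lemmas 3-b and 3-c can be compared against the definition for a range of n. This is a numerical spot check rather than a substitute for the proofs; `pc0`, `pc`, and `halves` are hypothetical helper names.

```python
def pc0(n):
    """One cycle of Pc°[n] from Definition 3-a (i = 1..2n+2)."""
    return [2 * (i - 1) if i <= n + 1 else 4 * n + 5 - 2 * i
            for i in range(1, 2 * n + 3)]

def pc(n):
    """Cycle as seen by S: integer[ Pc°(i)/2 ]."""
    return [v // 2 for v in pc0(n)]

def halves(n):
    """Split the cycle into Pc(1..n+1) and Pc(n+2..2n+2)."""
    c = pc(n)
    return c[:n + 1], c[n + 1:]

first, second = halves(3)
print(first)    # [0, 1, 2, 3]   Lemma 3-b: Pc(i) = i-1
print(second)   # [3, 2, 1, 0]   Lemma 3-c: Pc(i) = 2n+2-i
```

Each half lists the pages 0 through n exactly once, so the references within each half are distinct, as the two lemmas claim.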


5.5.3.5 Lemma 3-d:

At the end of each cycle, Pc[n], of the page trace,
P[n], M¹ contains the pages, in LRU stack order,

S° = { s°(1), ..., s°(n) }

where

s°(j) = j-1 for j = 1, ..., n.

Proof:

Since each cycle, Pc[n], of P[n] is of length 2n+2
which is greater than n, the S° LRU stack consists of the
last n page references of Pc[n] in reverse order by property
P2, P3, and Lemma 3-c. Thus,

s°(j) = Pc(2n+3-j)

such that

s°(1) = Pc(2n+2), s°(2) = Pc(2n+1), ..., s°(n) = Pc(n+3).

When j takes on values { 1, ..., n }, 2n+3-j takes on values
{ 2n+2, ..., n+3 }. Thus, for j = 1, ..., n and based upon
Lemma 3-c:

s°(j) = Pc(2n+3-j)
      = 2n+2-(2n+3-j)
      = j-1.

Q.E.D.
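
Lemma 3-d can be observed by recording the LRU stack of S at the end of successive cycles. The sketch below uses the closed forms of Lemmas 3-b and 3-c to build the cycle (0, 1, ..., n, n, ..., 1, 0); the helper names are hypothetical.

```python
def lru_stack_after(trace, frames):
    """Return the LRU stack (most recently used first) after the whole trace."""
    stack = []
    for page in trace:
        if page in stack:
            stack.remove(page)
        elif len(stack) == frames:
            stack.pop()            # remove the least recently used page
        stack.insert(0, page)
    return stack

def pc(n):
    """One cycle as seen by S: 0, 1, ..., n, n, ..., 1, 0."""
    return list(range(n + 1)) + list(range(n, -1, -1))

def end_of_cycle_stack(n, cycles):
    """LRU stack of S (|M1| = n) after `cycles` complete cycles."""
    return lru_stack_after(pc(n) * cycles, n)

print(end_of_cycle_stack(3, 1))   # [0, 1, 2]
print(end_of_cycle_stack(3, 4))   # [0, 1, 2]
```

The stack position j (counting from 1) holds page j-1 at the end of every cycle, which is the steady-state starting condition used in the proof of Lemma 3-e.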


S.S.3.6  Loaaa  3-#: 

3iven  a  deaaod-fetch  LHU-reaoval  two-level  storage 
systea,  s,  with  page  size  a,  ptiaary  atore  nize  s* 
contaiaiag  pegws,  the  page  fetch  function,  **, 

resulting  froa  n:h  steaiy-ntate  cycle,  pc(o),  of  the 
page  trace  P  has  the  value  2  ft,#.,  r(Pc(a)j*2  ounn., 

steady  ntate)  . 
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Proof: 

Let us subdivide the Pc[n] cycle, which is of length 2n+2,
into four regions as follows:

   Region 1:   Pc1 = { Pc(1), ..., Pc(n) }
   Region 2:   Pc2 = { Pc(n+1) }
   Region 3:   Pc3 = { Pc(n+2), ..., Pc(2n+1) }
   Region 4:   Pc4 = { Pc(2n+2) }

and compute the number of page fetches in each region, F1,
F2, F3, and F4, respectively. Since the page trace regions are
concatenated, the page fetches are cumulative, so we know
that

      F = F1 + F2 + F3 + F4.

Region 1: Pc1 = { Pc(1), ..., Pc(n) }

From Lemma 3-b, we know that

      Pc(i) = i-1        i = 1, ..., n+1

and from Lemma 3-d, we know that at the beginning of each
cycle

      s°(j) = j-1        j = 1, ..., n.

The page references { Pc(1), ..., Pc(n) } are actually the
sequence { 0, ..., n-1 } which is identical to the contents
of M1 at the start of the cycle, S°. Therefore, no page
transfers are required although LRU stack reordering may
occur (F1 = 0).

Region 2: Pc2 = { Pc(n+1) }

Page reference Pc(n+1) is page n, which is not contained
in S° nor loaded during region 1 (in fact, no pages were
fetched during region 1); thus, a page transfer is required
(F2 = 1). Using similar techniques as in Lemma 3-d, since each
reference of Pc1 is distinct, the LRU removal stack at this
point is

      S = { s(1), ..., s(n) }

where

      s(j) = Pc(n+1-j)        j = 1, ..., n.

Page s(n) is selected for removal; this is actually page
Pc(n+1-n) = Pc(1) = 0. The new LRU stack ordering becomes

      s(j) = Pc(n+2-j)        j = 1, ..., n.


Region 3: Pc3 = { Pc(n+2), ..., Pc(2n+1) }

The page references { Pc(n+2), ..., Pc(2n+1) } are
actually the sequence { n, ..., 1 }, as shown in the proof of
Lemma 3-c. The LRU stack ordering immediately prior to
reference Pc(n+2) is

      S = { s(1), ..., s(n) }

which is actually

      S = { n, ..., 1 }

since it has been shown above that at reference Pc(n+2)

      s(j) = Pc(n+2-j)        j = 1, ..., n.

Thus, as in region 1, every page referenced is already
contained in M1 and there are no page transfers required
(F3 = 0).

Region 4: Pc4 = { Pc(2n+2) }

The reference Pc(2n+2) is actually page 0. This page was
removed from M1 in region 2; thus, a page transfer is required
(F4 = 1).

We can conclude

      F = F1[Pc1] + F2[Pc2] + F3[Pc3] + F4[Pc4]
        = 0 + 1 + 0 + 1
        = 2.

Therefore, F[Pc[n]] = 2.

Q.E.D.

5.5.3.7 Lemma 3-f:

   Given a demand-fetch LRU-removal two-level storage
   system, S', with page size N' = N/2, primary store size
   |M1'| containing 2n = |M1|/(N/2) pages, the page fetch
   function, F', resulting from each steady-state cycle,
   Pc'[n], of the page trace P' has the value 2n+2 (i.e.,
   F'[Pc'[n]] = 2n+2 during steady state).

Proof:

The proof follows directly from the definition of P',
the LRU properties, and the previous Lemmas.

•  Each page reference in the cyclic pattern Pc'[n] is
distinct. (This can be easily seen from the definition or
proven in a similar manner to Lemmas 3-b and 3-c.)

•  Each cycle is 2n+2 references long.

•  At any time t, page reference P'(t) = P'(t-2n-2).

•  The primary store, M1, can hold 2n pages in S' since
N' = N/2.

•  Since the cyclic pattern only repeats after 2n+2 steps
and M1 is only 2n pages large, M1 always holds the last 2n
page references (since they are distinct).

•  Thus, at any time t, page reference P'(t) will not
correspond to any page currently in M1 (i.e., M1 holds
references { P'(t-1), ..., P'(t-2n) } and P'(t) = P'(t-2n-2)
is not in that set). As a result, a page fetch is required
for every page reference.

•  Since there are 2n+2 page references per cycle, there
are 2n+2 page fetches required per cycle. Thus, F' = 2n+2.

Q.E.D.

5.5.3.8 Theorem 3:

   For any two demand-fetch LRU-removal two-level storage
   systems, S and S', with page sizes N and N' = N/2 and
   primary store sizes |M1| and |M1'| = 2|M1|, respectively,
   there exists a cyclic page trace, P = (Pc)*, where
   |Pc| = 2(|M1|+1), such that the steady-state page fetch
   frequency ratio, r, equals |M1|+1.

Proof:

This proof follows trivially from Lemmas 3-e and 3-f.
We know that F = 2 for each steady-state cycle of S (Lemma
3-e). Also, for each steady-state cycle of S', F' = 2n+2
(Lemma 3-f). Since the page fetch frequency ratio, r, is
defined as f'/f or (F'/|P|)/(F/|P|), which equals F'/F, we
find that in steady state, with n = |M1|,

      r = F'/F = (2n+2)/2 = n+1.

Q.E.D.
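
As an informal check on Lemmas 3-e and 3-f, the cyclic trace
can be simulated directly. The following Python sketch is not
part of the original analysis. It assumes that the values
Pc°(i) used in the proofs above (2(i-1) for i = 1, ..., n+1
and 4n+5-2i for i = n+2, ..., 2n+2) are the half-size page
numbers seen by S'; the names cycle_half_pages, lru_fetches,
and steady_state are hypothetical.

```python
from collections import OrderedDict


def cycle_half_pages(n):
    """One cycle as seen by S' (assumed to be the Pc°(i) values)."""
    rising = [2 * (i - 1) for i in range(1, n + 2)]                 # i = 1 .. n+1
    falling = [4 * n + 5 - 2 * i for i in range(n + 2, 2 * n + 3)]  # i = n+2 .. 2n+2
    return rising + falling


def lru_fetches(trace, frames):
    """Count demand fetches for an LRU-managed store of `frames` pages."""
    stack = OrderedDict()          # least recently used entry comes first
    fetches = 0
    for page in trace:
        if page in stack:
            stack.move_to_end(page)            # hit: becomes most recent
        else:
            fetches += 1
            if len(stack) >= frames:
                stack.popitem(last=False)      # remove least recently used
            stack[page] = True
    return fetches


def steady_state(n, warmup=3, measured=5):
    """Return (F, F', r) per steady-state cycle when S holds n pages."""
    cycle_s_prime = cycle_half_pages(n)
    cycle_s = [p // 2 for p in cycle_s_prime]  # Pc(i) = integer[Pc°(i)/2]

    def per_cycle(cycle, frames):
        # Fetches in the measured cycles only: total minus the warm-up prefix.
        total = lru_fetches(cycle * (warmup + measured), frames)
        transient = lru_fetches(cycle * warmup, frames)
        return (total - transient) / measured

    f = per_cycle(cycle_s, n)                  # S: n full-size pages
    f_prime = per_cycle(cycle_s_prime, 2 * n)  # S': 2n half-size pages
    return f, f_prime, f_prime / f


for n in (3, 128):
    print(n, steady_state(n))
```

Under these assumptions the sketch should report F = 2,
F' = 2n+2, and r = n+1 for each n tried, in agreement with
Theorem 3.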


5.5.4 Comments on Theorem 3


The above results expose another facet of the page size
anomaly. As the size of the primary store, M1, is increased,
the overall page fetch frequency ratio as stated in
Corollary 3a also increases. This means that the larger the
primary store that you have, the more "dangerous" the page
size anomaly becomes. For example, in a two-level paging
system based on devices 2 and 4 from Table 1, |M1| = 128
pages and N = 4096 bytes; if the page size is decreased by
half to 2048 bytes, it is possible that the page fetch
frequency would increase 129-fold (a 12,800% increase in
paging activity!). Of course, one would assume, or at least
hope, that such pathological page trace patterns would be
very rare, but we know that they can exist. It is
interesting to note that the pathological pattern shown
above (e.g., a+ b+ c+ c- b- a-) corresponds to the expected
references of nested subroutine calls (i.e., subroutine a
calls subroutine b which calls subroutine c, etc., and each
subroutine, of course, returns to its caller). This is also
true of other stack-like program constructs. Such highly
modular program design is quite typical and, furthermore, it
is often explicitly encouraged. In view of Hatfield's finding
where the overall r exceeded 2.0 in many programs, it is
reasonable to assume that there were probably regions in
which r was quite small, possibly below 1.0, which were
counterbalanced by regions with very high values of r. At
present we do not have this particular information
available, but if it were true, performance could be greatly
improved by eliminating the high r value regions. This
problem will be discussed in the next section.


5.5.5 Bounds for FIFO Removal Algorithm


Theorem 3 applies to LRU removal algorithms and many
other removal algorithms, although these other cases will
not be explicitly proven in this thesis. It is interesting
to consider whether the result of Theorem 3 applies to the
FIFO removal algorithm. Unfortunately, due to the
peculiarities of FIFO, a simple generalizable cyclic page
trace pattern has not been found. But isolated examples
have been found, as illustrated in Figure 12, that show that
it is possible for r to exceed |M1|+1. This result is stated
in Theorem 4. Based upon other examples, it is conjectured
that r, when FIFO removal is used, may be as high as
2|M1|.


Parameters

As seen by S:

•  P    = a,c,a,b,b,c,c,a,a,b,b,c,c,a,a,b,b,c,c
•  |P|  = 19
•  Q    = { a, b, c }
•  |Q|  = 3
•  |M1| = 2
•  FIFO Removal

As seen by S':

•  P'    = a+,c-,a-,b+,b-,c+,c-,a+,a-,b+,b-,c+,c-,a+,a-,b+,b-,c+,c-
•  |P'|  = 19
•  Q'    = { a+, a-, b+, b-, c+, c- }
•  |Q'|  = 6
•  |M1'| = 4
•  FIFO Removal

Simulation

            |<--- transient ---->|<------ steady-state cycle ----->|
Trace:      a+ c- a- b+ b- c+ c- a+ a- b+ b- c+ c- a+ a- b+ b- c+ c-

S
Fetch:      *  *     *           *           *           *
M1:         a  c  c  b  b  b  b  a  a  a  a  c  c  c  c  b  b  b  b
               a  a  c  c  c  c  b  b  b  b  a  a  a  a  c  c  c  c

S'
Fetch:      *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
M1:         a+ c- a- b+ b- c+ c- a+ a- b+ b- c+ c- a+ a- b+ b- c+ c-
               a+ c- a- b+ b- c+ c- a+ a- b+ b- c+ c- a+ a- b+ b- c+
                  a+ c- a- b+ b- c+ c- a+ a- b+ b- c+ c- a+ a- b+ b-
                     a+ c- a- b+ b- c+ c- a+ a- b+ b- c+ c- a+ a- b+

Results

For the complete trace:

•  F  = 6
•  F' = 19
•  r  = 19/6 = 3.16

For the steady-state cycle:

•  F  = 3
•  F' = 12
•  r  = 12/3 = 4.0


Figure 12.

Cyclic Page Trace with FIFO Removal


THEOREM 4:     (th4)

   For any two demand-fetch FIFO-removal two-level storage
   systems, S and S', with page sizes N and N' = N/2 and
   certain primary store sizes |M1| and |M1'| = 2|M1|,
   respectively, there exists a cyclic page trace,
   P = Pt(Pc)*, where |Pc| = 2(|M1|+1)(|M1|), such that the
   page fetch frequency ratio, r, exceeds |M1|+1.

Proof:


By example (Figure 12).
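
The fetch counts of Figure 12 can be reproduced with a few
lines of Python. This is only an illustrative sketch and not
part of the original proof: the function names are
hypothetical, and the split of the trace into a 7-reference
transient and a 12-reference cycle is read off the figure.

```python
from collections import deque


def fifo_fetches(trace, frames):
    """Count demand fetches for a FIFO-managed store of `frames` pages."""
    queue = deque()                # oldest resident page at the left
    fetches = 0
    for page in trace:
        if page in queue:
            continue               # hit: FIFO order is not changed
        fetches += 1
        if len(queue) >= frames:
            queue.popleft()        # remove the page fetched longest ago
        queue.append(page)
    return fetches


def as_seen_by_s(trace):
    """Map half pages such as 'a+' to the full-size page 'a'."""
    return [half[0] for half in trace]


transient = "a+ c- a- b+ b- c+ c-".split()
cycle = "a+ a- b+ b- c+ c- a+ a- b+ b- c+ c-".split()


def figure_12():
    """Return (F, F', F per cycle, F' per cycle) for |M1| = 2, |M1'| = 4."""
    trace = transient + cycle
    f = fifo_fetches(as_seen_by_s(trace), 2)
    f_prime = fifo_fetches(trace, 4)
    # Appending one more cycle isolates the steady-state fetch counts.
    longer = transient + cycle * 2
    f_cycle = fifo_fetches(as_seen_by_s(longer), 2) - f
    f_prime_cycle = fifo_fetches(longer, 4) - f_prime
    return f, f_prime, f_cycle, f_prime_cycle


print(figure_12())
```

The steady-state ratio obtained this way is 12/3 = 4.0, which
exceeds |M1|+1 = 3 for this example.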
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uo 


CHAPTER 6.

SPATIAL VS. TEMPORAL LOCALITY MODEL OF PROGRAM BEHAVIOR

6.0 Introduction

Early in this thesis it was explained that a major
rationale for multilevel storage systems is based upon the
Principle of Locality. Unfortunately, locality is still a
poorly understood, or at least controversial, phenomenon. In
this chapter some novel viewpoints and insights will be
presented.

6.1 Forms of Program Reference Locality

Let us consider two extreme forms of program reference
locality which will be called temporal locality and spatial
locality:

6.1.1 Temporal Locality

   If the logical addresses { a1, a2, ... } are referenced
   during the time interval t-T to t, there is a high
   probability that these same logical addresses will be
   referenced during the time interval t to t+T.

This behavior can be rationalized by program constructs
such as loops, frequently used variables, and
frequently used subroutines.

6.1.2 Spatial Locality

   If the logical address a is referenced at time t, there
   is a high probability that a logical address in the
   range a-A to a+A will be referenced at time t+1.

This behavior can be rationalized by program constructs
such as sequential instruction sequencing and linear
data structures (e.g., arrays).
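
The two definitions above can be made concrete by estimating
the corresponding probabilities from an address trace. The
Python sketch below is one possible reading of the
definitions, not a measure used in this thesis: the functions
temporal_locality and spatial_locality and the two sample
traces are hypothetical, with T and A playing the roles they
have in the definitions.

```python
def temporal_locality(trace, T):
    """Average fraction of the addresses referenced in the interval
    t-T to t that are referenced again in the interval t to t+T."""
    scores = []
    for t in range(T, len(trace) - T + 1):
        past = set(trace[t - T:t])
        future = set(trace[t:t + T])
        scores.append(len(past & future) / len(past))
    return sum(scores) / len(scores)


def spatial_locality(trace, A):
    """Fraction of references that are followed, at time t+1, by an
    address in the range a-A to a+A."""
    pairs = list(zip(trace, trace[1:]))
    return sum(abs(b - a) <= A for a, b in pairs) / len(pairs)


# A tight loop over four widely separated addresses.
loop = [0, 500, 1000, 1500] * 25
# A single sequential sweep through 100 consecutive addresses.
sweep = list(range(100))

print(temporal_locality(loop, 4), spatial_locality(loop, 1))
print(temporal_locality(sweep, 4), spatial_locality(sweep, 1))
```

The loop trace scores high on temporal locality and low on
spatial locality, while the sequential sweep does the
opposite, which is the distinction the two definitions are
meant to capture.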

6.1.3 General Locality

The definitions of temporal and spatial locality above
are quite extreme. Usually we consider only the general
spatiotemporal properties and define locality as:

Locality

   If the logical addresses { a1, a2, ... } are referenced
   during the time interval t-T to t, there is a high
   probability that the logical addresses in the ranges
   a1-A to a1+A, a2-A to a2+A, ..., will be referenced
   during the time interval t to t+T.

It is important to recognize that temporal locality and
spatial locality are indeed the underlying phenomena and
that the "general locality" is surely a simplifying merging
and blurring of these basic concepts.

6.2 Conventional Removal Algorithms

We can begin to understand the factors causing the page
size anomaly by studying how the various conventional
removal algorithms handle temporal and spatial locality. In
particular, we see that whereas temporal locality policies
are given explicit attention, spatial locality policies are
usually handled implicitly and subtly. The "least recently
used", LRU, removal algorithm, for example, is very much
concerned about the temporal aspects of the program's
reference pattern. The spatial aspects are handled as a
by-product of the fact that the demand fetch algorithm must
load an entire page (i.e., a spatial region) at a time and
LRU removal decisions are based upon these pages. With these
thoughts in mind, we can see that decreasing page size
causes the conventional storage management algorithms to
increase their sensitivity to temporal locality and decrease
their sensitivity to spatial locality. Increasing page size,
of course, results in the reverse effect.

6.3 Locality in Actual Programs

Many of the techniques for improving the locality
behavior of programs, such as the method of automatic
program restructuring by sector (subroutine) reordering
described by Hatfield and Gerald [47], result in both
increased temporal and spatial locality. But it seems that
the reordering technique does, in fact, significantly favor
spatial locality since it was noted [47] that:


   "the better orderings not only concentrate [...]
   into pages, but these pages also [...] cluster into
   larger units that satisfy nearness requirements on
   the page level - and cluster better than do the
   pages of the other orderings ... clustering sectors
   into pages also clusters pages into larger units."


6.4 Locality Mixes

An effective multilevel storage management system must
take both temporal and spatial locality into consideration.
As we have seen from both Hatfield's and Seligman's results,
neglecting spatial locality can have disastrous results.
Any given program, or portion of a program's operation, can
have its reference locality characterized by the two-by-two
matrix:

                     TEMPORAL

                   Low     High
        S
        P
        A   Low     1        2
        T
        I   High    3        4
        A
        L


Quadrant 1, low-temporal and low-spatial locality, is
definitely undesirable for operation in a multilevel storage
system. There have been numerous algorithms and programmer
training techniques developed, as mentioned above, to
minimize the number of programs with these poor locality
characteristics. Quadrant 4, high-temporal and high-spatial
locality, has traditionally been the region of best
performance and is usually the objective of good program
design. Unfortunately, it is not always possible or
convenient to design programs which attain both high
temporal and high spatial locality; thus, we find many
programs operating in quadrants 2 or 3.

6.5 Spatial Locality Algorithms

Storage management techniques are needed which provide
far more flexibility and robustness for balancing the
system's sensitivity to temporal and spatial locality. These
algorithms must explicitly consider the spatial locality of
a program. The tuple-coupling approach, described in the
next chapter, is one such technique. It takes advantage of
the temporal locality and compactness possible with small
pages characterized by quadrant 2 behavior, yet it adjusts
to the spatial locality and clustering characterized by
quadrant 3 behavior by simulating the removal policies
associated with large pages.

6.6 Comment on the Page Size Anomaly

With this insight, we can now see that the page size
anomaly is not really even a function strictly of page size!
Instead, it is an issue of locality, temporal versus
spatial.

CHAPTER 7.

SPATIAL  REMOVAL  STORAGE  MANAGEMENT  ALGORITHMS 

7.0 Introduction

As stated earlier in this thesis and noted by Hatfield,
a removal algorithm that would limit the page fetch
frequency ratio, r, to 2 would be very desirable. In this
section a technique, called the "tuple-coupling approach",
is described which, when used in conjunction with
conventional removal algorithms, such as LRU or FIFO,
guarantees that r will not exceed 2.

7.1 Tuple-Coupling Approach

The basic concept behind the tuple-coupling approach is
extremely simple. First, the two portions, p+ and p-, of
each original larger page, p, must be identifiable (i.e.,
the set of pages of S' are viewed as a collection of
2-tuples). Second, the removal ordering policies must be
applied to both elements of a tuple (i.e., the tuples are
coupled in regard to ordering decisions) such that a page p+
or p- of S' is never removed unless the corresponding page p
of S would also have been removed from M1. The particular
implementation of this approach may vary slightly depending
upon the removal algorithm, e.g., LRU, FIFO, etc., that is
to be used. Any removal algorithm to which the
tuple-coupling approach can be incorporated is said to be
"tuple-couple-able".

7.1.1 An Example of LRU Tuple-Coupling


Figure 13 illustrates the application of the
tuple-coupling approach to the LRU removal example
previously shown in Figure 9. It should be noted that, in
this case, r has indeed been limited to 2 although it had a
value of 2.2 when normal LRU removal was used. The reader
should carefully compare Figures 9 and 13 to understand how
the tuple-coupling approach affects the removal algorithm.
The M1 contents are identical, of course, for S in both
examples, but there are subtle differences in M1 contents
for S'. Each state of M1 contents is marked, 1 to 11, in
Figure 13 for reference purposes. Notice that in this
implementation of tuple-coupling whenever both halves of a
page, p+ and p-, are in M1, they are always adjacent in the
M1 ordering; compare this with Figure 9.


Parameters

As seen by S:

•  P    = a, b, a, b, c, c, b, a, a, c, c
•  |P|  = 11
•  Q    = { a, b, c }
•  |Q|  = 3
•  |M1| = 2
•  LRU Removal

As seen by S':

•  P'    = a+, b+, a-, b-, c+, c-, b+, a+, a-, c+, c-
•  |P'|  = 11
•  Q'    = { a+, a-, b+, b-, c+, c- }
•  |Q'|  = 6
•  |M1'| = 4
•  LRU Removal with Tuple-Coupling

Simulation

                1   2   3   4   5   6   7   8   9   10  11
Page Trace:     a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-

S
Fetch:          *   *           *           *       *
M1 Contents:    a   b   a   b   c   c   b   a   a   c   c
                    a   b   a   b   b   c   b   b   a   a

S'
Fetch:          *   *   *   *   *   *       *   *   *   *
M1 Contents:    a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-
                    a+  a+  b+  b-  c+  b-  b+  a+  a-  c+
                        b+  a-  b+  b-  c-  b-  b+  a+  a-
                            a+  a-  b+  c+  c-  b-  b+  a+

Results

•  F  = 5
•  F' = 10
•  r  = 10/5 = 2.0


Figure 13.

Example of LRU Removal with Tuple-Coupling
(see Figure 9 for comparison)


At page trace step 3 we can see the first difference
between Figures 9 and 13. Page a- is referenced and must be
fetched. In Figure 9, it is then placed at the top of the M1
ordering which becomes a-,b+,a+. On the other hand, in
Figure 13 at step 3, it is noticed that a+ was already in
M1. Thus, when a- is placed at the top of the M1 ordering,
a+ is coupled to it, resulting in the ordering a-,a+,b+. At
page trace step 7 of Figure 13 we see another interesting
example of the tuple-coupling approach. At the previous step
the M1 ordering was

      c-  c+  b-  b+

When the reference to b+ is made, there is no need to
initiate a fetch since b+ is already in M1. The M1 ordering
then becomes

      b+  b-  c-  c+

since LRU requires that the most recent reference move to
the top. Under this tuple-coupling scheme, b- is also moved
toward the top of the ordering to continue to be adjacent to
b+.
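
The walk-through above can be checked mechanically. The
following Python sketch is one possible rendering of the
adjacent-ordering variant illustrated in Figure 13, not the
thesis's own implementation; the helper names partner and
lru, and the choice to always remove the bottom page of the
ordering, are assumptions made for illustration.

```python
def partner(half):
    """Return the other half of the same original page ('a+' <-> 'a-')."""
    return half[0] + ("-" if half[1] == "+" else "+")


def lru(trace, frames, coupled):
    """Return (fetch count, final ordering), most recently used page first."""
    order = []
    fetches = 0
    for page in trace:
        if page in order:
            order.remove(page)                 # hit: will be moved to the top
        else:
            fetches += 1
            if len(order) == frames:
                order.pop()                    # remove the bottom page
        if coupled and partner(page) in order:
            # Tuple-coupling: the partner moves up with the referenced
            # page so that the two halves stay adjacent in the ordering.
            order.remove(partner(page))
            order.insert(0, partner(page))
        order.insert(0, page)
    return fetches, order


trace_s_prime = "a+ b+ a- b- c+ c- b+ a+ a- c+ c-".split()
trace_s = [half[0] for half in trace_s_prime]

print(lru(trace_s, 2, coupled=False))          # S, normal LRU
print(lru(trace_s_prime, 4, coupled=False))    # S', normal LRU
print(lru(trace_s_prime, 4, coupled=True))     # S', LRU with tuple-coupling
```

For this trace the sketch gives 5 fetches in S, 11 in S'
with normal LRU (r = 2.2), and 10 in S' with tuple-coupling
(r = 2.0), matching the values quoted above.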


7.1.2 Implementation of the Tuple-Coupling Approach

It is important to note that there are often various
ways to implement tuple-coupling. In particular, in the LRU
tuple-coupling algorithm described above, the 2-tuples,
whenever both portions were in M1, were arranged to be
adjacent in the removal ordering. The requirement that
neither portion, p+ or p-, of a tuple in S' be removed
unless the corresponding page of S would have been removed
can be accomplished in other ways. For example, the LRU
removal stack can be left in its normal ordering, as in
Figure 9. In this case, when it is necessary to remove a
page from S' the bottom page is not necessarily the correct
choice to satisfy tuple-coupling. There is an algorithm
which can scan the LRU stack and select the correct page for
removal (in fact, it will select, of course, the same page
selected by the algorithm illustrated in Figure 13).

7.1.3 An Example of FIFO Tuple-Coupling

It is interesting to consider the effect of
tuple-coupling upon FIFO removal. Figure 14 illustrates the
application of the tuple-coupling approach to the FIFO
removal example previously shown in Figure 8. Once again,
the page fetch frequency ratio, r, which originally was 2.75
has indeed been limited to 2. The example of Figure 14 does
not fully illustrate all the interesting aspects of
tuple-coupling upon FIFO removal. In particular, if page p+,
for example, is referenced in a page trace and it was not
already in M1, it must be fetched. The M1 contents are
reordered as follows:

1.  If p- is not currently in M1, p+ is placed at the top
    of the FIFO ordering.

2.  If p- is currently in M1, p+ is placed immediately
    before p- in the logical FIFO ordering; p-'s relative
    ordering remains unchanged.

The reason for the second part of this rule can be seen from
the normal FIFO ordering rule which places a page p at the
top only if it were not already in M1. If it were in M1, it
remains at its previous ordering position. Under
tuple-coupling, this rule applies jointly to the (p+,p-)
tuple as stated above. The reader is encouraged to work
through the example of Figure 10 using the tuple-coupling
approach to illustrate this FIFO ordering phenomenon.


Parameters

As seen by S:

•  P    = a, b, a, b, c, c, b, a, a, c, c
•  |P|  = 11
•  Q    = { a, b, c }
•  |Q|  = 3
•  |M1| = 2
•  FIFO Removal

As seen by S':

•  P'    = a+, b+, a-, b-, c+, c-, b+, a+, a-, c+, c-
•  |P'|  = 11
•  Q'    = { a+, a-, b+, b-, c+, c- }
•  |Q'|  = 6
•  |M1'| = 4
•  FIFO Removal with Tuple-Coupling

Simulation

                1   2   3   4   5   6   7   8   9   10  11
Page Trace:     a+  b+  a-  b-  c+  c-  b+  a+  a-  c+  c-

S
Fetch:          *   *           *           *
M1 Contents:    a   b   b   b   c   c   c   a   a   a   a
                    a   a   a   b   b   b   c   c   c   c

S'
Fetch:          *   *   *   *   *   *       *   *
M1 Contents:    a+  b+  a-  b-  c+  c-  c-  a+  a-  a-  a-
                    a+  a+  b+  b-  c+  c+  c-  a+  a+  a+
                        b+  a-  b+  b-  b-  c+  c-  c-  c-
                            a+  a-  b+  b+  b-  c+  c+  c+

Results

•  F  = 4
•  F' = 8
•  r  = 8/4 = 2.0


Figure 14.

Example of FIFO Removal with Tuple-Coupling
(see Figure 8 for comparison)


The effect of the tuple-coupling approach is summarized in
Theorem 5.
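
The two-part reordering rule can be expressed compactly in
code. The Python sketch below follows the rule as stated in
the text (a newly fetched half is placed immediately before
its partner when the partner is resident, otherwise at the
top); it is an illustration under that reading rather than a
definitive implementation, and the names partner and fifo are
hypothetical. Only the fetch totals are compared with
Figure 14.

```python
def partner(half):
    """Return the other half of the same original page ('a+' <-> 'a-')."""
    return half[0] + ("-" if half[1] == "+" else "+")


def fifo(trace, frames, coupled):
    """Return (fetch count, final ordering); the last entry is removed next."""
    order = []                     # newest entry first
    fetches = 0
    for page in trace:
        if page in order:
            continue               # hit: the FIFO ordering is unchanged
        fetches += 1
        if len(order) == frames:
            order.pop()            # remove the bottom (oldest) entry
        if coupled and partner(page) in order:
            # Rule 2: place the new half immediately before its partner;
            # the partner keeps its relative position.
            order.insert(order.index(partner(page)), page)
        else:
            # Rule 1 (and normal FIFO): place the new page at the top.
            order.insert(0, page)
    return fetches, order


trace_s_prime = "a+ b+ a- b- c+ c- b+ a+ a- c+ c-".split()
trace_s = [half[0] for half in trace_s_prime]

print(fifo(trace_s, 2, coupled=False))         # S, normal FIFO
print(fifo(trace_s_prime, 4, coupled=False))   # S', normal FIFO
print(fifo(trace_s_prime, 4, coupled=True))    # S', FIFO with tuple-coupling
```

For this trace the sketch gives 4 fetches in S, 11 in S'
with normal FIFO (r = 2.75), and 8 in S' with tuple-coupling
(r = 2.0).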

(th5)

THEOREM 5:

   For any two demand-fetch two-level storage systems, S
   and S', with page sizes N and N' = N/2, respectively, the
   use of the "tuple-coupling" approach for S' in
   conjunction with a removal algorithm that is
   "tuple-couple-able" is sufficient to guarantee that the
   page fetch frequency ratio, r, cannot exceed the value
   2 for all possible page traces, P.

Proof:


(See below)

7.1.4 Proof of Theorem 5

As described earlier, when an address trace, A, is
applied to storage systems S (with page size N) and S' (with
page size N' = N/2), it can be represented as page traces P
and P', respectively. At time t1, let us consider a specific
address reference, a, whose corresponding page references
are p (in S) and pa (in S'). In processing this reference
there are four possible fetch actions in S and S' depending
upon the current content state of primary store, M1:


   State   page p (S)   page pa (S')   F    F'   effect

     1     in M1        in M1          0    0    r => 1
     2     in M1        not in M1      0    1    r => >1
     3     not in M1    in M1          1    0    r => <1
     4     not in M1    not in M1      1    1    r => 1

Recall that the page fetch frequency ratio, r, equals
F'/F. In states 1 and 4 the same action (i.e., no page fetch
in 1 and a page fetch in 4) occurs in both S and S'; the
occurrence of these states causes r to tend towards 1. In
state 3, a page fetch is required in S but not in S'; this
situation, if frequent, will cause r to decrease toward
zero. This is usually the intended result of reducing page
size. Only state 2, in which S' alone requires a page fetch,
contributes to an increase in r. Thus, we will concentrate
our analysis on this particular situation.

Since state 2 requires that page p be in M1 at time t1,
if we scan the address trace backwards, there must be some
previous reference time t2 that caused page p (in S) to be
fetched into M1 (this may have been the only previous
reference to p or the page p may have been fetched and
removed many times). At time t2, there must also be a
corresponding reference to either p- or p+ of S'. Taking the
reference at time t1 to be p+, these two cases will be
considered separately:

Case 1:  P  = ...  p   ...  p
         P' = ...  p-  ...  p+
         t  = ...  t2  ...  t1

This case merely illustrates the fact that it can
require two page fetches (for p+ and p-) in S' to fetch the
same amount of storage as page p in S. If this were the only
case for state 2, r would never exceed 2.

Case 2:  P  = ...  p   ...  p
         P' = ...  p+  ...  p+
         t  = ...  t2  ...  t1

In this case we see that subsequent to reference t2
page p of S and page p+ of S' must be in M1. Yet at time t1
page p of S is still in M1 but page p+ of S' is not. Under
these circumstances r can certainly exceed 2; merely making
p- the next reference will account for 3 fetches in S'
compared to 1 fetch in S. Furthermore, it is possible that
the references between t2 and t1 could be repeated to
continually cause fetches for p+ in S' without any
corresponding fetches required in S. Thus, we see that this
is precisely the situation that allows r to exceed 2.

Under closer analysis, we see that this situation
requires that in S' pa be removed from M1 between t2 and t1
whereas in S p remains in M1. In other words, this general
situation can only occur if at some time t, p+ or p- of S'
is selected for removal from M1 and the corresponding page p
of S is not also removed from M1. But the tuple-coupling
algorithm (see Section 7.1) is "such that a page p+ or p- of
S' is never removed unless the corresponding page p of S
would also have been removed from M1". Thus, tuple-coupling
eliminates the possibility of case 2 and therefore
guarantees that r cannot exceed 2.


Q.E.D.


7.2 Effectiveness of Tuple-Coupling


Clearly, the tuple-coupling approach has an influence
upon the overall effectiveness of the basic removal
algorithm being used and the benefits of the smaller page
size. It is obvious that there are certain reference
patterns (with r less than 2) for which tuple-coupling
increases the value of r. On the other hand, it can be
shown, as a simple exercise for the reader, that the example
of Figure 6 retains its low page fetch frequency ratio of
0.5 even when tuple-coupling is used. In fact,
tuple-coupling may often result in the "best of both worlds"
by placing a bound on the page fetch frequency ratio, r, for
high r regions without interfering with the performance of
originally low r regions.

A program's reference behavior in S', during a short
interval of its operation, may be characterized by three
regions based upon the value of the page fetch frequency
ratio, r, when tuple-coupling is not used:

1.  Sparse reference - small r (e.g., less than 1).

2.  Moderate reference - moderate r (e.g., between 1
    and 2).

3.  Dense reference - high r (e.g., greater than 2).

In the sparse reference region, it is unlikely that both
portions, p+ and p-, of a page, p, will be in M1
simultaneously; thus, tuple-coupling will have minimal
effect upon performance. In the dense reference region, we
have already seen that tuple-coupling prevents extreme
values of r. Based upon some recent, though limited,
measurements, it appears that in the moderate reference
region tuple-coupling performs about as well as the
non-tuple-coupled algorithms.

CHAPTER 8.

DISCUSSION AND CONCLUSIONS


8.0 Introduction

Efficient and effective storage management is important
to the development of future computer systems. It has been
estimated that the storage subsystems account for over 70%
of the cost of most contemporary installations and, based
upon present trends, this percentage is expected to
increase.

Much more research will be needed before all the
problems of automatic storage management are understood and
the obstacles to effective operation eliminated. This
thesis has solved several open problems and has provided
insight that should lead to the solution of many more
problems.

8.1 Summary

A detailed discussion of the many facets of storage
management is presented in Chapter 2. It also contains a
general discussion of the requirements which a system must
satisfy to be effective for the user.

In Chapters 3 and 4 a model for storage hierarchy
systems is formalized and an implementation is proposed. The
system's design is based upon an orderly and uniform
treatment of the storage levels. Specific techniques to
improve performance, such as continuous hierarchy, shadow
storage, direct transfer, read through, store behind, and
automatic management, are explained.

In Chapter 5 the "page size anomaly" is presented (see
also Hatfield [48]):

"The  assumption  about  virtual  memory  systems  that  as 
overhead  (time  for  access  and  software  page 
management)  decreases  page  size  should  be  reduced  is 
not always a good one. Recent experiments indicate
that larger sizes can provide better performance for
programs that make highly localized use of memory
space."

This  phenomenon  is  formalized  and  a  bound  on  the  performance 
is  proven. 
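
The flavor of the anomaly can be seen in a small simulation.
The sketch below is not the construction used in Chapter 5. It
assumes LRU removal, a primary store that holds n full-size
pages, and a hypothetical cyclic reference pattern chosen only
for illustration; the function names are likewise invented
here. It also assumes that r is the number of fetches with
half-size pages divided by the number with full-size pages.

```python
from collections import OrderedDict


def lru_fetches(addresses, store_words, page_words):
    """Count page fetches under LRU for a given page size (in words)."""
    frames = store_words // page_words
    resident = OrderedDict()          # page -> None, least recently used first
    fetches = 0
    for addr in addresses:
        page = addr // page_words
        if page in resident:
            resident.move_to_end(page)        # hit: mark most recently used
        else:
            fetches += 1                      # miss: fetch the page
            if len(resident) >= frames:
                resident.popitem(last=False)  # remove the LRU page
            resident[page] = None
    return fetches


def cyclic_pattern(n_pages, page_words, cycles):
    """Hypothetical pattern: first halves of pages 0..n-1 in ascending
    order, second halves in descending order, then one extra page."""
    half = page_words // 2
    cycle = [i * page_words for i in range(n_pages)]
    cycle += [i * page_words + half for i in reversed(range(n_pages))]
    cycle.append(n_pages * page_words)
    return cycle * cycles


n, page_words, cycles = 4, 2, 10
refs = cyclic_pattern(n, page_words, cycles)
store = n * page_words                # primary store holds n full pages

full = lru_fetches(refs, store, page_words)        # full-size pages
half = lru_fetches(refs, store, page_words // 2)   # half-size pages
print(full, half, half / full)
```

For this pattern the full-size system settles into 2 fetches
per cycle, while the half-size system misses on every one of
the 2n+1 references in the cycle, so the ratio grows roughly
linearly with n. With n = 4 and 10 cycles the counts are 23
and 90.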

In Chapters 6 and 7 the concept of spatial locality is
introduced and serves as the basis for a new storage removal
algorithm called "tuple-coupling". These concepts are used
to explain the occurrence of the "page size anomaly" in
actual systems. It is proven that the tuple-coupling
approach is a sufficient strategy to avoid the occurrence of
the "page size anomaly", and it offers potential performance
improvements for the storage hierarchy system.

The techniques and theorems presented in this thesis
provide a much more scientifically sound basis for examining
and designing storage hierarchy systems than most current ad
hoc approaches. Although there is still a long way to go,
development of these formalisms is essential to the
advancing of the "science" in Computer Science.

There are many areas touched on by this work in which
questions remain. One of the most significant is in the
development and study of other possible "spatial locality"
removal algorithms in addition to the tuple-coupling
approach studied in this thesis. This is an entirely wide
open area.

Although tuple-coupling is studied extensively in this
thesis, there are still many unanswered questions. How does
tuple-coupling compare with the class of "stack" algorithms
studied by Mattson [*3], in particular under what
circumstances, if any, is tuple-coupling a stack algorithm?
Likewise, how does tuple-coupling compare with the
theoretically optimal replacement algorithm, called OPT (or
MIN) [12]? On a more practical side, how efficiently can a
tuple-coupling algorithm, or other spatial removal
algorithms, be implemented?

In order to ascertain specific proof of the utility and
efficiency of general storage hierarchies, it will be
necessary to actually construct and measure the performance
of such a system or, at least, perform more extensive
simulation analysis. Furthermore, we must develop overall
programming techniques and execution environments that are
even more amenable to efficient operation in a storage
hierarchy system.

Many of these questions are currently under
investigation; the results will be published later in an MIT
Project MAC Technical Report.